NOVEL NUCLEIC ACIDS AND POLYPEPTIDES

1. TECHNICAL FIELD 

The present invention provides novel polynucleotides and proteins encoded by such
polynucleotides, along with uses for these polynucleotides and proteins, for example in
therapeutic, diagnostic and research methods.

2. BACKGROUND 

Technology aimed at the discovery of protein factors (including, e.g., cytokines such as
lymphokines, interferons, CSFs, chemokines, and interleukins) has matured rapidly over the past
decade. The now routine hybridization cloning and expression cloning techniques clone novel
polynucleotides "directly" in the sense that they rely on information directly related to the
discovered protein (i.e., partial DNA/amino acid sequence of the protein in the case of
hybridization cloning; activity of the protein in the case of expression cloning). More recent
"indirect" cloning techniques such as signal sequence cloning, which isolates DNA sequences
based on the presence of a now well-recognized secretory leader sequence motif, as well as
various PCR-based or low stringency hybridization-based cloning techniques, have advanced the
state of the art by making available large numbers of DNA/amino acid sequences for proteins
that are known to have biological activity, for example, by virtue of their secreted nature in the
case of leader sequence cloning, by virtue of their cell or tissue source in the case of PCR-based
techniques, or by virtue of structural similarity to other genes of known biological activity.

Identified polynucleotide and polypeptide sequences have numerous applications in, for
example, diagnostics, forensics, gene mapping, identification of mutations responsible for
genetic disorders or other traits, assessment of biodiversity, and production of many other types
of data and products dependent on DNA and amino acid sequences.

3. SUMMARY OF THE INVENTION 

The compositions of the present invention include novel isolated polypeptides, novel
isolated polynucleotides encoding such polypeptides, including recombinant DNA molecules,
cloned genes or degenerate variants thereof, especially naturally occurring variants such as allelic
variants, antisense polynucleotide molecules, and antibodies that specifically recognize one or more
epitopes present on such polypeptides, as well as hybridomas producing such antibodies.

The compositions of the present invention additionally include vectors, including expression
vectors, containing the polynucleotides of the invention, cells genetically engineered to contain such
polynucleotides and cells genetically engineered to express such polynucleotides.
The present invention relates to a collection or library of at least one novel nucleic acid
sequence assembled from expressed sequence tags (ESTs) isolated mainly by sequencing by
hybridization (SBH), and in some cases, sequences obtained from one or more public databases.
The invention relates also to the proteins encoded by such polynucleotides, along with therapeutic,
diagnostic and research utilities for these polynucleotides and proteins. These nucleic acid
sequences are designated as SEQ ID NO: 1-1786 and 3573-5358. The polypeptide sequences are
designated SEQ ID NO: 2n (wherein n = 1 to 20). The nucleic acids and polypeptides are provided
in the Sequence Listing. In the nucleic acids provided in the Sequence Listing, A is adenine; C is
cytosine; G is guanine; T is thymine; and N is any of the four bases. In the amino acids provided in
the Sequence Listing, * corresponds to the stop codon.

The nucleic acid sequences of the present invention also include nucleic acid sequences that
hybridize to the complement of SEQ ID NO: 1-1786 and 3573-5358 under stringent hybridization
conditions; nucleic acid sequences which are allelic variants or species homologues of any of the
nucleic acid sequences recited above; or nucleic acid sequences that encode a peptide comprising a
specific domain or truncation of the peptides encoded by SEQ ID NO: 1-1786 and 3573-5358. Also
included is a polynucleotide comprising a nucleotide sequence having at least 90% identity to an
identifying sequence of SEQ ID NO: 1-1786 and 3573-5358 or a degenerate variant or fragment
thereof. The identifying sequence can be 100 base pairs in length.

The nucleic acid sequences of the present invention also include the sequence information
from the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358. The sequence information
can be a segment of any one of SEQ ID NO: 1-1786 and 3573-5358 that uniquely identifies or
represents the sequence information of SEQ ID NO: 1-1786 and 3573-5358.

A collection as used in this application can be a collection of only one polynucleotide. The
collection of sequence information or identifying information of each sequence can be provided on
a nucleic acid array. In one embodiment, segments of sequence information are provided on a
nucleic acid array to detect the polynucleotide that contains the segment. The array can be designed
to detect full-match or mismatch to the polynucleotide that contains the segment. The collection
can also be provided in a computer-readable format.

This invention also includes the reverse or direct complement of any of the nucleic acid
sequences recited above; cloning or expression vectors containing the nucleic acid sequences; and
host cells or organisms transformed with these expression vectors. Nucleic acid sequences (or their
reverse or direct complements) according to the invention have numerous applications in a variety
of techniques known to those skilled in the art of molecular biology, such as use as hybridization
probes, use as primers for PCR, use in an array, use in computer-readable media, use in sequencing
full-length genes, use for chromosome and gene mapping, use in the recombinant production of
protein, and use in the generation of anti-sense DNA or RNA, their chemical analogs and the like.

In a preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358
or novel segments or parts of the nucleic acids of the invention are used as primers in
expression assays that are well known in the art. In a particularly preferred embodiment, the nucleic
acid sequences of SEQ ID NO: 1-1786 and 3573-5358 or novel segments or parts of the nucleic
acids provided herein are used in diagnostics for identifying expressed genes or, as well known in
the art and exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for
physical mapping of the human genome.

The isolated polynucleotides of the invention include, but are not limited to, a
polynucleotide comprising any one of the nucleotide sequences set forth in SEQ ID NO: 1-1786 and
3573-5358; a polynucleotide comprising any of the full length protein coding sequences of SEQ ID
NO: 1-1786 and 3573-5358; and a polynucleotide comprising any of the nucleotide sequences of the
mature protein coding sequences of SEQ ID NO: 1-1786 and 3573-5358. The polynucleotides of the
present invention also include, but are not limited to, a polynucleotide that hybridizes under
stringent hybridization conditions to (a) the complement of any one of the nucleotide sequences set
forth in SEQ ID NO: 1-1786 and 3573-5358; (b) a nucleotide sequence encoding any one of the
amino acid sequences set forth in the Sequence Listing; (c) a polynucleotide which is an allelic
variant of any polynucleotides recited above; (d) a polynucleotide which encodes a species homolog
(e.g., orthologs) of any of the proteins recited above; or (e) a polynucleotide that encodes a
polypeptide comprising a specific domain or truncation of any of the polypeptides comprising an
amino acid sequence set forth in the Sequence Listing.

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising any of the amino acid sequences set forth in the Sequence Listing; or the corresponding
full length or mature protein. Polypeptides of the invention also include polypeptides with biological
activity that are encoded by (a) any of the polynucleotides having a nucleotide sequence set forth in
SEQ ID NO: 1-1786 and 3573-5358; or (b) polynucleotides that hybridize to the complement of the
polynucleotides of (a) under stringent hybridization conditions. Biologically or immunologically
active variants of any of the polypeptide sequences in the Sequence Listing, and "substantial
equivalents" thereof (e.g., with at least about 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99%
amino acid sequence identity) that preferably retain biological activity are also contemplated. The
polypeptides of the invention may be wholly or partially chemically synthesized but are preferably
produced by recombinant means using the genetically engineered cells (e.g., host cells) of the
invention.
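
By way of illustration only, the percent-identity thresholds recited above can be checked
with a short Python sketch. The sketch assumes the two amino acid sequences have already been
aligned (gaps written as "-") and counts identical residues over all alignment columns; this
passage does not prescribe an alignment method or an identity convention, so both choices, and the
function names, are assumptions made for the example.

```python
# Illustrative only: percent identity over a pre-computed pairwise alignment.
# The convention used here (matches / alignment columns) is an assumption;
# other conventions divide by the shorter sequence length instead.

IDENTITY_THRESHOLDS = (65, 70, 75, 80, 85, 90, 95, 98, 99)  # values recited in the text


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Return percent identity between two aligned sequences of equal length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences carry a gap.
    columns = [(x, y) for x, y in zip(aligned_a.upper(), aligned_b.upper())
               if not (x == gap and y == gap)]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * matches / len(columns)


def thresholds_met(identity: float) -> list:
    """List the recited thresholds that a given identity value satisfies."""
    return [t for t in IDENTITY_THRESHOLDS if identity >= t]


if __name__ == "__main__":
    pid = percent_identity("MKTA-IAK", "MKTAYLAK")
    print(pid, thresholds_met(pid))
```

A variant scoring 75% under this convention would satisfy the 65%, 70% and 75% thresholds
but not the higher ones.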
The invention also provides compositions comprising a polypeptide of the invention.
Polypeptide compositions of the invention may further comprise an acceptable carrier, such as a
hydrophilic, e.g., pharmaceutically acceptable, carrier.

The invention also provides host cells transformed or transfected with a polynucleotide of 
the invention. 

The invention also relates to methods for producing a polypeptide of the invention 
comprising growing a culture of the host cells of the invention in a suitable culture medium 
under conditions permitting expression of the desired polypeptide, and purifying the polypeptide 
from the culture or from the host cells. Preferred embodiments include those in which the 
protein produced by such process is a mature form of the protein. 

Polynucleotides according to the invention have numerous applications in a variety of 
techniques known to those skilled in the art of molecular biology. These techniques include use 
as hybridization probes, use as oligomers, or primers, for PCR, use for chromosome and gene 
mapping, use in the recombinant production of protein, and use in generation of anti-sense DNA 
or RNA, their chemical analogs and the like. For example, when the expression of an mRNA is 
largely restricted to a particular cell or tissue type, polynucleotides of the invention can be used 
as hybridization probes to detect the presence of the particular cell or tissue mRNA in a sample 
using, e.g., in situ hybridization. 

In other exemplary embodiments, the polynucleotides are used in diagnostics as 
expressed sequence tags for identifying expressed genes or, as well known in the art and 
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for physical
mapping of the human genome. 

The polypeptides according to the invention can be used in a variety of conventional
procedures and methods that are currently applied to other proteins. For example, a polypeptide
of the invention can be used to generate an antibody that specifically binds the polypeptide. Such
antibodies, particularly monoclonal antibodies, are useful for detecting or quantitating the
polypeptide in tissue. The polypeptides of the invention can also be used as molecular weight
markers, and as a food supplement.

Methods are also provided for preventing, treating, or ameliorating a medical condition
which comprises the step of administering to a mammalian subject a therapeutically effective
amount of a composition comprising a polypeptide of the present invention and a
pharmaceutically acceptable carrier.

In particular, the polypeptides and polynucleotides of the invention can be utilized, for
example, in methods for the prevention and/or treatment of disorders involving aberrant protein
expression or biological activity.
The present invention further relates to methods for detecting the presence of the
polynucleotides or polypeptides of the invention in a sample. Such methods can, for example, be
utilized as part of prognostic and diagnostic evaluation of disorders as recited herein and for the
identification of subjects exhibiting a predisposition to such conditions. The invention provides
a method for detecting the polynucleotides of the invention in a sample, comprising contacting
the sample with a compound that binds to and forms a complex with the polynucleotide of
interest for a period sufficient to form the complex and under conditions sufficient to form a
complex and detecting the complex such that if a complex is detected, the polynucleotide of
interest is detected. The invention also provides a method for detecting the polypeptides of the
invention in a sample comprising contacting the sample with a compound that binds to and forms
a complex with the polypeptide under conditions and for a period sufficient to form the complex
and detecting the formation of the complex such that if a complex is formed, the polypeptide is
detected.

The invention also provides kits comprising polynucleotide probes and/or monoclonal
antibodies, and optionally quantitative standards, for carrying out methods of the invention.
Furthermore, the invention provides methods for evaluating the efficacy of drugs, and
monitoring the progress of patients, involved in clinical trials for the treatment of disorders as
recited above.

The invention also provides methods for the identification of compounds that modulate
(i.e., increase or decrease) the expression or activity of the polynucleotides and/or polypeptides
of the invention. Such methods can be utilized, for example, for the identification of compounds
that can ameliorate symptoms of disorders as recited herein. Such methods can include, but are
not limited to, assays for identifying compounds and other substances that interact with (e.g.,
bind to) the polypeptides of the invention. The invention provides a method for identifying a
compound that binds to the polypeptides of the invention comprising contacting the compound
with a polypeptide of the invention in a cell for a time sufficient to form a polypeptide/compound
complex, wherein the complex drives expression of a reporter gene sequence in the cell; and
detecting the complex by detecting the reporter gene sequence expression such that if expression
of the reporter gene is detected, the compound that binds to a polypeptide of the invention is
identified.

The invention also provides methods for treatment which involve the
administration of the polynucleotides or polypeptides of the invention to individuals exhibiting
symptoms or tendencies. In addition, the invention encompasses methods for treating diseases or
disorders as recited herein comprising administering compounds and other substances that
modulate the overall activity of the target gene products. Compounds and other substances can
effect such modulation either on the level of target gene/protein expression or target protein
activity.

The polypeptides of the present invention and the polynucleotides encoding them are also 
useful for the same functions known to one of skill in the art as the polypeptides and 
polynucleotides to which they have homology (set forth in Table 2); for which they have a 
signature region (as set forth in Table 3); or for which they have homology to a gene family (as 
set forth in Table 4). If no homology is set forth for a sequence, then the polypeptides and 
polynucleotides of the present invention are useful for a variety of applications, as described 
herein, including use in arrays for detection. 

4. DETAILED DESCRIPTION OF THE INVENTION 

4.1 DEFINITIONS 

It must be noted that as used herein and in the appended claims, the singular forms "a,"
"an" and "the" include plural references unless the context clearly dictates otherwise.

The term "active" refers to those forms of the polypeptide which retain the biologic 
and/or immunologic activities of any naturally occurring polypeptide. According to the 
invention, the terms "biologically active" or "biological activity" refer to a protein or peptide 
having structural, regulatory or biochemical functions of a naturally occurring molecule. 
Likewise "immunologically active" or "immunological activity" refers to the capability of the 
natural, recombinant or synthetic polypeptide to induce a specific immune response in 
appropriate animals or cells and to bind with specific antibodies. 

The term "activated cells" as used in this application refers to those cells which are engaged in
extracellular or intracellular membrane trafficking, including the export of secretory or
enzymatic molecules as part of a normal or disease process.

The terms "complementary" or "complementarity" refer to the natural binding of 
polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to the 
complementary sequence 3'-TCA-5'. Complementarity between two single-stranded molecules
may be "partial" such that only some of the nucleic acids bind or it may be "complete" such that 
total complementarity exists between the single stranded molecules. The degree of 
complementarity between the nucleic acid strands has significant effects on the efficiency and 
strength of the hybridization between the nucleic acid strands. 
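
By way of illustration only, the base-pairing rule and the distinction between "partial" and
"complete" complementarity can be sketched in Python. The names below are hypothetical, the
sketch covers DNA bases only, and it compares two strands of equal length position by position;
it does not model hybridization strength.

```python
# Illustrative only: Watson-Crick pairing for DNA strands of equal length.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the complementary strand written in the 5'->3' direction."""
    return "".join(PAIRS[base] for base in reversed(seq.upper()))


def complementarity(strand_5to3: str, other_3to5: str) -> float:
    """Fraction of facing positions that base-pair.

    `other_3to5` is written 3'->5' so that position i faces position i
    of `strand_5to3`, as in the 5'-AGT-3' / 3'-TCA-5' example.
    """
    if len(strand_5to3) != len(other_3to5):
        raise ValueError("strands must have equal length")
    if not strand_5to3:
        raise ValueError("strands must be non-empty")
    paired = sum(1 for a, b in zip(strand_5to3.upper(), other_3to5.upper())
                 if PAIRS.get(a) == b)
    return paired / len(strand_5to3)


if __name__ == "__main__":
    print(complementarity("AGT", "TCA"))   # complete: every position pairs
    print(complementarity("AGT", "TCG"))   # partial: one position does not pair
    print(reverse_complement("AGT"))       # same partner strand, written 5'->3'
```

A returned value of 1.0 corresponds to complete complementarity; any value between 0 and 1
corresponds to partial complementarity in the sense used above.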

The term "embryonic stem cells (ES)" refers to a cell that can give rise to many 
differentiated cell types in an embryo or an adult, including the germ cells. The term "germ line 
stem cells (GSCs)" refers to stem cells derived from primordial stem cells that provide a steady 
and continuous source of germ cells for the production of gametes. The term "primordial germ
cells (PGCs)" refers to a small population of cells set aside from other cell lineages particularly
from the yolk sac, mesenteries, or gonadal ridges during embryogenesis that have the potential to
differentiate into germ cells and other cells. PGCs are the source from which GSCs and ES cells
are derived. The PGCs, the GSCs and the ES cells are capable of self-renewal. Thus these cells
not only populate the germ line and give rise to a plurality of terminally differentiated cells that 
comprise the adult specialized organs, but are able to regenerate themselves. 

The term "expression modulating fragment," EMF, means a series of nucleotides which 
modulates the expression of an operably linked ORF or another EMF. 

As used herein, a sequence is said to "modulate the expression of an operably linked 
sequence" when the expression of the sequence is altered by the presence of the EMF. EMFs 
include, but are not limited to, promoters, and promoter modulating sequences (inducible 
elements). One class of EMFs are nucleic acid fragments which induce the expression of an 
operably linked ORF in response to a specific regulatory factor or physiological event. 

The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or
"oligonucleotide" are used interchangeably and refer to a heteropolymer of nucleotides or the
sequence of these nucleotides. These phrases also refer to DNA or RNA of genomic or synthetic
origin which may be single-stranded or double-stranded and may represent the sense or the
antisense strand, to peptide nucleic acid (PNA) or to any DNA-like or RNA-like material. In the
sequences herein A is adenine, C is cytosine, T is thymine, G is guanine and N is A, C, G or T 
(U). It is contemplated that where the polynucleotide is RNA, the T (thymine) in the sequences 
provided herein is substituted with U (uracil). Generally, nucleic acid segments provided by this 
invention may be assembled from fragments of the genome and short oligonucleotide linkers, or 
from a series of oligonucleotides, or from individual nucleotides, to provide a synthetic nucleic 
acid which is capable of being expressed in a recombinant transcriptional unit comprising 
regulatory elements derived from a microbial or viral operon, or a eukaryotic gene. 
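
By way of illustration only, the sequence notation described above (T replaced by U for RNA, and
N standing for any base) can be sketched in Python. The function names are hypothetical and the
sketch handles only the symbols A, C, G, T, U and N.

```python
# Illustrative only: the notation conventions stated in the definition above.
BASES = "ACGTU"


def to_rna(dna: str) -> str:
    """Rewrite a DNA sequence in RNA notation by substituting U for T."""
    return dna.upper().replace("T", "U")


def matches_with_n(pattern: str, seq: str) -> bool:
    """True if `seq` agrees with `pattern` at every position.

    An N in `pattern` is treated as matching any single base.
    """
    pattern, seq = pattern.upper(), seq.upper()
    if len(pattern) != len(seq):
        return False
    return all(p == s or (p == "N" and s in BASES)
               for p, s in zip(pattern, seq))


if __name__ == "__main__":
    print(to_rna("AGTN"))
    print(matches_with_n("ANT", "AGT"), matches_with_n("ANT", "AGC"))
```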

The terms "oligonucleotide fragment" or a "polynucleotide fragment", "portion," or
"segment" or "probe" or "primer" are used interchangeably and refer to a sequence of nucleotide
residues which are at least about 5 nucleotides, more preferably at least about 7 nucleotides,
more preferably at least about 9 nucleotides, more preferably at least about 11 nucleotides and
most preferably at least about 17 nucleotides. The fragment is preferably less than about 500
nucleotides, preferably less than about 200 nucleotides, more preferably less than about 100
nucleotides, more preferably less than about 50 nucleotides and most preferably less than 30
nucleotides. Preferably the probe is from about 6 nucleotides to about 200 nucleotides,
preferably from about 15 to about 50 nucleotides, more preferably from about 17 to 30
nucleotides and most preferably from about 20 to 25 nucleotides. Preferably the fragments can
be used in polymerase chain reaction (PCR), various hybridization procedures or microarray
procedures to identify or amplify identical or related parts of mRNA or DNA molecules. A
fragment or segment may uniquely identify each polynucleotide sequence of the present
invention. Preferably the fragment comprises a sequence substantially similar to any one of SEQ
ID NOs: 1-20.

Probes may, for example, be used to determine whether specific mRNA molecules are 
present in a cell or tissue or to isolate similar nucleic acid sequences from chromosomal DNA as 
described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods Appl 1:241-250). They may
be labeled by nick translation, Klenow fill-in reaction, PCR, or other methods well known in the 
art. Probes of the present invention, their preparation and/or labeling are elaborated in 
Sambrook, J. et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor 
Laboratory, NY; or Ausubel, F.M. et al., 1989, Current Protocols in Molecular Biology, John 
Wiley & Sons, New York NY, both of which are incorporated herein by reference in their 
entirety. 

The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358. The
sequence information can be a segment of any one of SEQ ID NO: 1-1786 and 3573-5358 that
uniquely identifies or represents the sequence information of that sequence of SEQ ID NO: 1-1786
and 3573-5358. One such segment can be a twenty-mer nucleic acid sequence because the
probability that a twenty-mer is fully matched in the human genome is approximately 1 in 300. In
the human genome, there are three billion base pairs in one set of chromosomes. Because 4^20
possible twenty-mers exist, there are approximately 300 times more twenty-mers than there are base
pairs in a set of human chromosomes. Using the same analysis, the probability for a seventeen-mer
to be fully matched in the human genome is approximately 1 in 5. When these segments are used in
arrays for expression studies, fifteen-mer segments can be used. The probability that the
fifteen-mer is fully matched in the expressed sequences is also approximately one in five because
expressed sequences comprise less than approximately 5% of the entire genome sequence.
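
By way of illustration only, the arithmetic behind these estimates can be reproduced in Python.
The sketch assumes, as the analysis above implicitly does, that bases occur uniformly and
independently, and it takes the three-billion-base-pair genome size and the 5% expressed fraction
from the text; the function names are hypothetical.

```python
# Illustrative only: ratio of genome size to the number of possible k-mers.
GENOME_BP = 3_000_000_000    # base pairs in one set of human chromosomes (from the text)
EXPRESSED_FRACTION = 0.05    # upper bound on expressed sequence (from the text)


def possible_kmers(k: int) -> int:
    """Number of distinct sequences of length k over four bases."""
    return 4 ** k


def chance_of_full_match(k: int, target_bp: float) -> float:
    """Approximate chance that a given k-mer occurs in a random target of target_bp bases."""
    return target_bp / possible_kmers(k)


if __name__ == "__main__":
    cases = [
        (20, GENOME_BP, "whole genome"),
        (17, GENOME_BP, "whole genome"),
        (15, GENOME_BP * EXPRESSED_FRACTION, "expressed sequences"),
    ]
    for k, target, label in cases:
        p = chance_of_full_match(k, target)
        print(f"{k}-mer against {label}: about 1 in {1 / p:.1f}")
```

Under these assumptions the ratios come out at roughly 366, 5.7 and 7.2, which are of the same
order as the rounded figures of 300, 5 and 5 given above.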

Similarly, when using sequence information for detecting a single mismatch, a segment can
be a twenty-five mer. The probability that the twenty-five mer would appear in a human genome
with a single mismatch is calculated by multiplying the probability for a full match (1 ÷ 4^25) times the
increased probability for mismatch at each nucleotide position (3 x 25). The probability that an
eighteen mer with a single mismatch can be detected in an array for expression studies is
approximately one in five. The probability that a twenty-mer with a single mismatch can be
detected in a human genome is approximately one in five.
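
By way of illustration only, the single-mismatch calculation can be written out in Python. The
sketch multiplies the full-match probability (1 ÷ 4^k) by the number of single-mismatch variants
(3 x k) and by the size of the target, assuming uniformly random sequence; the genome size and
expressed fraction are taken from the preceding discussion, and the function name is hypothetical.

```python
# Illustrative only: chance that a k-mer occurs with exactly one mismatch.
GENOME_BP = 3_000_000_000          # base pairs in one set of human chromosomes
EXPRESSED_BP = GENOME_BP * 0.05    # expressed sequences, taken as 5% of the genome


def chance_of_single_mismatch(k: int, target_bp: float) -> float:
    """(1 / 4**k) * (3 * k) * target_bp, following the calculation in the text."""
    full_match = 1 / 4 ** k        # probability of a full match at one position
    mismatch_variants = 3 * k      # three alternative bases at each of k positions
    return full_match * mismatch_variants * target_bp


if __name__ == "__main__":
    for k, target, label in [(25, GENOME_BP, "genome"),
                             (20, GENOME_BP, "genome"),
                             (18, EXPRESSED_BP, "expressed sequences")]:
        p = chance_of_single_mismatch(k, target)
        print(f"{k}-mer, one mismatch, {label}: {p:.4g}")
```

Under these assumptions the twenty-mer and eighteen-mer cases evaluate to roughly 0.16 and 0.12
respectively, the same order of magnitude as the "approximately one in five" figures stated above.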

The term "open reading frame," ORF, means a series of nucleotide triplets coding for
amino acids without any termination codons and is a sequence translatable into protein. 
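
By way of illustration only, the definition above can be expressed as a scan for the longest run
of triplets that contains no termination codon. The sketch assumes the standard stop codons TAA,
TAG and TGA and a DNA-notation input; it does not require a start codon, since the definition
above does not recite one, and the function name is hypothetical.

```python
# Illustrative only: longest stop-free run of triplets in one reading frame.
STOP_CODONS = {"TAA", "TAG", "TGA"}   # standard genetic code, DNA notation


def longest_stop_free_run(seq: str, frame: int = 0) -> str:
    """Return the longest run of complete triplets in `frame` with no stop codon."""
    seq = seq.upper()
    best, current = [], []
    # Step through complete triplets only; trailing bases are ignored.
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            current = []                  # a stop codon ends the current run
        else:
            current.append(codon)
            if len(current) > len(best):
                best = list(current)
    return "".join(best)


if __name__ == "__main__":
    example = "ATGGCCTAAGGGCCCAAATTT"
    for frame in range(3):
        print(frame, longest_stop_free_run(example, frame))
```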

The terms "operably linked" or "operably associated" refer to functionally related nucleic 
acid sequences. For example, a promoter is operably associated or operably linked with a coding 
sequence if the promoter controls the transcription of the coding sequence. While operably 
linked nucleic acid sequences can be contiguous and in the same reading frame, certain genetic 
elements, e.g., repressor genes, are not contiguously linked to the coding sequence but still control
transcription/translation of the coding sequence. 

The term "pluripotent" refers to the capability of a cell to differentiate into a number of 
differentiated cell types that are present in an adult organism. A pluripotent cell is restricted in its 
differentiation capability in comparison to a totipotent cell. 

The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an oligopeptide, 
peptide, polypeptide or protein sequence or fragment thereof and to naturally occurring or 
synthetic molecules. A polypeptide "fragment," "portion," or "segment" is a stretch of amino 
acid residues of at least about 5 amino acids, preferably at least about 7 amino acids, more 
preferably at least about 9 amino acids and most preferably at least about 17 or more amino
acids. The peptide preferably is not greater than about 200 amino acids, more preferably less 
than 150 amino acids and most preferably less than 100 amino acids. Preferably the peptide is 
from about 5 to about 200 amino acids. To be active, any polypeptide must have sufficient 
length to display biological and/or immunological activity. 

The term "naturally occurring polypeptide" refers to polypeptides produced by cells that 
have not been genetically engineered and specifically contemplates various polypeptides arising 
from post-translational modifications of the polypeptide including, but not limited to, acetylation, 
carboxylation, glycosylation, phosphorylation, lipidation and acylation. 

The term "translated protein coding portion" means a sequence which encodes for the full 
length protein which may include any leader sequence or any processing sequence. 

The term "mature protein coding sequence" means a sequence which encodes a peptide 
or protein without a signal or leader sequence. The "mature protein portion" means that portion 
of the protein which does not include a signal or leader sequence. The peptide may have been 
produced by processing in the cell which removes any leader/signal sequence. The mature 
protein portion may or may not include the initial methionine residue. The methionine residue 
may be removed from the protein during processing in the cell. The peptide may be produced 
synthetically or the protein may have been produced using a polynucleotide only encoding for 
the mature protein coding sequence. 
The term "derivative" refers to polypeptides chemically modified by such techniques as
ubiquitination, labeling (e.g., with radionuclides or various enzymes), covalent polymer 
attachment such as pegylation (derivatization with polyethylene glycol) and insertion or 
substitution by chemical synthesis of amino acids such as ornithine, which do not normally occur 
in human proteins. 

The term "variant" (or "analog") refers to any polypeptide differing from naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions, created using, e.g.,
recombinant DNA techniques. Guidance in determining which amino acid residues may be 
replaced, added or deleted without abolishing activities of interest, may be found by comparing 
the sequence of the particular polypeptide with that of homologous peptides and minimizing the 
number of amino acid sequence changes made in regions of high homology (conserved regions) 
or by replacing amino acids with consensus sequence. 

Alternatively, recombinant variants encoding these same or similar polypeptides may be 
synthesized or selected by making use of the "redundancy" in the genetic code. Various codon 
substitutions, such as the silent changes which produce various restriction sites, may be 
introduced to optimize cloning into a plasmid or viral vector or expression in a particular 
prokaryotic or eukaryotic system. Mutations in the polynucleotide sequence may be reflected in 
the polypeptide or domains of other peptides added to the polypeptide to modify the properties of 
any part of the polypeptide, to change characteristics such as ligand-binding affinities, interchain 
affinities, or degradation/turnover rate. 

Preferably, amino acid "substitutions" are the result of replacing one amino acid with
another amino acid having similar structural and/or chemical properties, i.e., conservative amino
acid replacements. "Conservative" amino acid substitutions may be made on the basis of
similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic
nature of the residues involved. For example, nonpolar (hydrophobic) amino acids include
alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine; polar 
neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and 
glutamine; positively charged (basic) amino acids include arginine, lysine, and histidine; and 
negatively charged (acidic) amino acids include aspartic acid and glutamic acid. "Insertions" or 
"deletions" are preferably in the range of about 1 to 20 amino acids, more preferably 1 to 10 
amino acids. The variation allowed may be experimentally determined by systematically making 
insertions, deletions, or substitutions of amino acids in a polypeptide molecule using 
recombinant DNA techniques and assaying the resulting recombinant variants for activity. 
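
By way of illustration only, the four residue groupings listed above can be used to flag whether
a substitution stays within a group. The sketch is a simplification: it treats a substitution as
"conservative" only when both residues fall in the same listed group, uses one-letter amino acid
codes, and compares sequences of equal length without insertions or deletions. The names are
hypothetical.

```python
# Illustrative only: residue groups as listed in the paragraph above.
RESIDUE_CLASSES = {
    "nonpolar": set("ALIVPFWM"),       # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar_neutral": set("GSTCYNQ"),   # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),               # Arg, Lys, His
    "acidic": set("DE"),               # Asp, Glu
}


def residue_class(aa: str) -> str:
    """Return the name of the group containing the one-letter residue `aa`."""
    for name, members in RESIDUE_CLASSES.items():
        if aa.upper() in members:
            return name
    raise ValueError(f"unrecognized residue: {aa!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues belong to the same listed group."""
    return residue_class(original) == residue_class(replacement)


def classify_substitutions(original: str, variant: str) -> list:
    """List (position, from, to, conservative) for each differing position (1-based)."""
    if len(original) != len(variant):
        raise ValueError("sequences must have equal length")
    return [(i, a, b, is_conservative(a, b))
            for i, (a, b) in enumerate(zip(original.upper(), variant.upper()), start=1)
            if a != b]


if __name__ == "__main__":
    print(classify_substitutions("MKLD", "MRLA"))
```

In this example the lysine-to-arginine change stays within the basic group, while the aspartic
acid-to-alanine change crosses from the acidic group to the nonpolar group.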

Alternatively, where alteration of function is desired, insertions, deletions or 
non-conservative alterations can be engineered to produce altered polypeptides. Such alterations 
can, for example, alter one or more of the biological functions or biochemical characteristics of
the polypeptides of the invention. For example, such alterations may change polypeptide 
characteristics such as ligand-binding affinities, interchain affinities, or degradation/turnover 
rate. Further, such alterations can be selected so as to generate polypeptides that are better suited 
for expression, scale up and the like in the host cells chosen for expression. For example, 
cysteine residues can be deleted or substituted with another amino acid residue in order to 
eliminate disulfide bridges. 

The terms "purified" or "substantially purified" as used herein denote that the indicated
nucleic acid or polypeptide is present in the substantial absence of other biological 
macromolecules, e.g., polynucleotides, proteins, and the like. In one embodiment, the 
polynucleotide or polypeptide is purified such that it constitutes at least 95% by weight, more 
preferably at least 99% by weight, of the indicated biological macromolecules present (but water, 
buffers, and other small molecules, especially molecules having a molecular weight of less than 
1000 daltons, can be present). 

The term "isolated" as used herein refers to a nucleic acid or polypeptide separated from 
at least one other component (e.g., nucleic acid or polypeptide) present with the nucleic acid or 
polypeptide in its natural source. In one embodiment, the nucleic acid or polypeptide is found in 
the presence of (if anything) only a solvent, buffer, ion, or other component normally present in a 
solution of the same. The terms "isolated" and "purified" do not encompass nucleic acids or 
polypeptides present in their natural source. 

The term "recombinant," when used herein to refer to a polypeptide or protein, means
that a polypeptide or protein is derived from recombinant (e.g., microbial, insect, or mammalian) 
expression systems. "Microbial" refers to recombinant polypeptides or proteins made in 
bacterial or fungal (e.g., yeast) expression systems. As a product, "recombinant microbial" 
defines a polypeptide or protein essentially free of native endogenous substances and 
unaccompanied by associated native glycosylation. Polypeptides or proteins expressed in most 
bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or 
proteins expressed in yeast will have a glycosylation pattern in general different from those 
expressed in mammalian cells. 

The term "recombinant expression vehicle or vector" refers to a plasmid or phage or virus 
or vector, for expressing a polypeptide from a DNA (RNA) sequence. An expression vehicle can 
comprise a transcriptional unit comprising an assembly of (1) a genetic element or elements 
having a regulatory role in gene expression, for example, promoters or enhancers, (2) a structural 
or coding sequence which is transcribed into mRNA and translated into protein, and (3) 
appropriate transcription initiation and termination sequences. Structural units intended for use 
in yeast or eukaryotic expression systems preferably include a leader sequence enabling
extracellular secretion of translated protein by a host cell. Alternatively, where recombinant 
protein is expressed without a leader or transport sequence, it may include an amino terminal 
methionine residue. This residue may or may not be subsequently cleaved from the expressed 
recombinant protein to provide a final product. 

The term "recombinant expression system" means host cells which have stably integrated 
a recombinant transcriptional unit into chromosomal DNA or carry the recombinant 
transcriptional unit extrachromosomally. Recombinant expression systems as defined herein will 
express heterologous polypeptides or proteins upon induction of the regulatory elements linked 
to the DNA segment or synthetic gene to be expressed. This term also means host cells which 
have stably integrated a recombinant genetic element or elements having a regulatory role in 
gene expression, for example, promoters or enhancers. Recombinant expression systems as 
defined herein will express polypeptides or proteins endogenous to the cell upon induction of the 
regulatory elements linked to the endogenous DNA segment or gene to be expressed. The cells 
can be prokaryotic or eukaryotic. 

The term "secreted" includes a protein that is transported across or through a membrane, 
including transport as a result of signal sequences in its amino acid sequence when it is expressed 
in a suitable host cell. "Secreted" proteins include without limitation proteins secreted wholly
(e.g., soluble proteins) or partially (e.g., receptors) from the cell in which they are expressed. 
"Secreted" proteins also include without limitation proteins that are transported across the 
membrane of the endoplasmic reticulum. "Secreted" proteins are also intended to include 
proteins containing non-typical signal sequences (e.g., Interleukin-1 Beta, see Krasney, P.A. and
Young, P.R. (1992) Cytokine 4(2):134-143) and factors released from damaged cells (e.g.,
Interleukin-1 Receptor Antagonist, see Arend, W.P. et al. (1998) Annu. Rev. Immunol.
16:27-55).

Where desired, an expression vector may be designed to contain a "signal or leader 
sequence" which will direct the polypeptide through the membrane of a cell. Such a sequence 
may be naturally present on the polypeptides of the present invention or provided from 
heterologous protein sources by recombinant DNA techniques. 

The term "stringent" is used to refer to conditions that are commonly understood in the 
art as stringent. Stringent conditions can include highly stringent conditions (i.e., hybridization 
to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 1 mM EDTA at
65°C, and washing in 0.1X SSC/0.1% SDS at 68°C), and moderately stringent conditions (i.e., 
washing in 0.2X SSC/0.1% SDS at 42°C). Other exemplary hybridization conditions are 
described herein in the examples.

In instances of hybridization of deoxyoligonucleotides, additional exemplary stringent
hybridization conditions include washing in 6X SSC/0.05% sodium pyrophosphate at 37°C (for
14-base oligonucleotides), 48°C (for 17-base oligonucleotides), 55°C (for 20-base oligonucleotides), and
60°C (for 23-base oligonucleotides).
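
For convenience, the oligonucleotide wash temperatures listed above can be held in a simple lookup. This is only a sketch; the names `OLIGO_WASH_TEMP_C` and `wash_temperature` are hypothetical, and no value is interpolated for lengths the text does not list.

```python
# Wash temperatures (°C) in 6X SSC/0.05% sodium pyrophosphate,
# keyed by oligonucleotide length in bases, as listed in the text.
OLIGO_WASH_TEMP_C = {14: 37, 17: 48, 20: 55, 23: 60}

def wash_temperature(length):
    """Return the listed wash temperature, or None if the text gives no value."""
    return OLIGO_WASH_TEMP_C.get(length)
```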

As used herein, "substantially equivalent" can refer both to nucleotide and amino acid 
sequences, for example a mutant sequence, that varies from a reference sequence by one or more 
substitutions, deletions, or additions, the net effect of which does not result in an adverse 
functional dissimilarity between the reference and subject sequences. Typically, such a 
substantially equivalent sequence varies from one of those listed herein by no more than about 
35% (i.e., the number of individual residue substitutions, additions, and/or deletions in a 
substantially equivalent sequence, as compared to the corresponding reference sequence, divided 
by the total number of residues in the substantially equivalent sequence is about 0.35 or less). 
Such a sequence is said to have 65% sequence identity to the listed sequence. In one 
embodiment, a substantially equivalent, e.g., mutant, sequence of the invention varies from a 
listed sequence by no more than 30% (70% sequence identity); in a variation of this embodiment, 
by no more than 25% (75% sequence identity); and in a further variation of this embodiment, by 
no more than 20% (80% sequence identity) and in a further variation of this embodiment, by no 
more than 10% (90% sequence identity) and in a further variation of this embodiment, by no 
more than 5% (95% sequence identity). Substantially equivalent, e.g., mutant, amino acid
sequences according to the invention preferably have at least 80% sequence identity with a listed 
amino acid sequence, more preferably at least 90% sequence identity. Substantially equivalent 
nucleotide sequences of the invention can have lower percent sequence identities, taking into 
account, for example, the redundancy or degeneracy of the genetic code. Preferably, a nucleotide
sequence has at least about 65% identity, more preferably at least about 75% identity, and most
preferably at least about 95% identity. For the purposes of the present invention, sequences 
having substantially equivalent biological activity and substantially equivalent expression 
characteristics are considered substantially equivalent. For the purposes of determining 
equivalence, truncation of the mature sequence (e.g., via a mutation which creates a spurious 
stop codon) should be disregarded. Sequence identity may be determined, e.g., using the Jotun 
Hein method (Hein, J. (1990) Methods Enzymol. 183:626-645). Identity between sequences can
also be determined by other methods known in the art, e.g. by varying hybridization conditions. 
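
The arithmetic in the definition above (differences divided by the number of residues in the substantially equivalent sequence, compared against about 0.35) can be sketched as follows. This is an illustration under stated assumptions, not the Jotun Hein method: the two sequences are assumed to be already aligned, with "-" marking gaps, and the function names and example sequences are hypothetical. The sketch checks only the numerical threshold and says nothing about functional equivalence.

```python
def fraction_varied(reference_aln, subject_aln):
    """Differences per subject residue, given two gapped strings of equal length."""
    if len(reference_aln) != len(subject_aln):
        raise ValueError("aligned sequences must have equal length")
    # A column counts as one difference if it is a substitution
    # or pairs a residue with a gap (an addition or deletion).
    differences = sum(1 for r, s in zip(reference_aln, subject_aln) if r != s)
    subject_residues = sum(1 for s in subject_aln if s != "-")
    return differences / subject_residues

def within_variation_limit(reference_aln, subject_aln, max_fraction=0.35):
    """True if the subject varies from the reference by no more than max_fraction."""
    return fraction_varied(reference_aln, subject_aln) <= max_fraction

ref = "MKTAYIAKQR"  # hypothetical reference
sub = "MKSAYI-KQR"  # one substitution and one deletion relative to ref
```

Here `fraction_varied(ref, sub)` is 2/9, so the hypothetical subject falls within the 0.35 limit.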

The term "totipotent" refers to the capability of a cell to differentiate into all of the cell 
types of an adult organism. 

The term "transformation" means introducing DNA into a suitable host cell so that the 
DNA is replicable, either as an extrachromosomal element, or by chromosomal integration. The
term "transfection" refers to the taking up of an expression vector by a suitable host cell, whether
or not any coding sequences are in fact expressed. The term "infection" refers to the introduction 
of nucleic acids into a suitable host cell by use of a virus or viral vector. 

As used herein, an "uptake modulating fragment," UMF, means a series of nucleotides 
which mediate the uptake of a linked DNA fragment into a cell. UMFs can be readily identified 
using known UMFs as a target sequence or target motif with the computer-based systems 
described below. The presence and activity of a UMF can be confirmed by attaching the 
suspected UMF to a marker sequence. The resulting nucleic acid molecule is then incubated 
with an appropriate host under appropriate conditions and the uptake of the marker sequence is 
determined. As described above, a UMF will increase the frequency of uptake of a linked 
marker sequence. 

Each of the above terms is meant to encompass all that is described for each, unless the 
context dictates otherwise. 

4.2 NUCLEIC ACIDS OF THE INVENTION 

Nucleotide sequences of the invention are set forth in the Sequence Listing. 

The isolated polynucleotides of the invention include a polynucleotide comprising the 
nucleotide sequences of SEQ ID NO:1-1786 and 3573-5358; a polynucleotide encoding any one
of the peptide sequences of SEQ ID NO:1787-3572 and 5359-7144; and a polynucleotide
comprising the nucleotide sequence encoding the mature protein coding sequence of the
polypeptides of any one of SEQ ID NO:1787-3572 and 5359-7144. The polynucleotides of the
present invention also include, but are not limited to, a polynucleotide that hybridizes under
stringent conditions to (a) the complement of any of the nucleotide sequences of SEQ ID
NO:1-1786 and 3573-5358; (b) nucleotide sequences encoding any one of the amino acid sequences
set forth in the Sequence Listing; (c) a polynucleotide which is an allelic variant of any 
polynucleotide recited above; (d) a polynucleotide which encodes a species homolog of any of 
the proteins recited above; or (e) a polynucleotide that encodes a polypeptide comprising a 
specific domain or truncation of the polypeptides of SEQ ID NO:1787-3572 and 5359-7144. 
Domains of interest may depend on the nature of the encoded polypeptide; e.g., domains in 
receptor-like polypeptides include ligand-binding, extracellular, transmembrane, or cytoplasmic 
domains, or combinations thereof; domains in immunoglobulin-like proteins include the variable 
immunoglobulin-like domains; domains in enzyme-like polypeptides include catalytic and 
substrate binding domains; and domains in ligand polypeptides include receptor-binding 
domains.

The polynucleotides of the invention include naturally occurring or wholly or partially
synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The polynucleotides 
may include all of the coding region of the cDNA or may represent a portion of the coding 
region of the cDNA. 

The present invention also provides genes corresponding to the cDNA sequences disclosed
herein. The corresponding genes can be isolated in accordance with known methods using the
sequence information disclosed herein. Such methods include the preparation of probes or primers
from the disclosed sequence information for identification and/or amplification of genes in
appropriate genomic libraries or other sources of genomic materials. Further 5' and 3' sequence can
be obtained using methods known in the art. For example, full length cDNA or genomic DNA that
corresponds to any of the polynucleotides of SEQ ID NO:1-1786 and 3573-5358 can be obtained
by screening appropriate cDNA or genomic DNA libraries under suitable hybridization conditions
using any of the polynucleotides of SEQ ID NO:1-1786 and 3573-5358 or a portion thereof as a
probe. Alternatively, the polynucleotides of SEQ ID NO:1-1786 and 3573-5358 may be used as the
basis for suitable primer(s) that allow identification and/or amplification of genes in appropriate
genomic DNA or cDNA libraries.

The nucleic acid sequences of the invention can be assembled from ESTs and sequences
(including cDNA and genomic sequences) obtained from one or more public databases, such as
dbEST, gbpri, and UniGene. The EST sequences can provide identifying sequence information,
representative fragment or segment information, or novel segment information for the full-length
gene.

The polynucleotides of the invention also provide polynucleotides including nucleotide
sequences that are substantially equivalent to the polynucleotides recited above. Polynucleotides
according to the invention can have, e.g., at least about 65%, at least about 70%, at least about
75%, at least about 80%, more typically at least about 90%, and even more typically at least
about 95%, sequence identity to a polynucleotide recited above.

Included within the scope of the nucleic acid sequences of the invention are nucleic acid
sequence fragments that hybridize under stringent conditions to any of the nucleotide sequences
of SEQ ID NO:1-1786 and 3573-5358, or complements thereof, which fragment is greater than
about 5 nucleotides, preferably 7 nucleotides, more preferably greater than 9 nucleotides and
most preferably greater than 17 nucleotides. Fragments of, e.g., 15, 17, or 20 nucleotides or more
that are selective for (i.e., specifically hybridize to) any one of the polynucleotides of the
invention are contemplated. Probes capable of specifically hybridizing to a polynucleotide can
differentiate polynucleotide sequences of the invention from other polynucleotide sequences in
the same family of genes or can differentiate human genes from genes of other species, and are
preferably based on unique nucleotide sequences.
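
As a rough illustration of basing probes on unique nucleotide sequences, the sketch below lists fixed-length fragments of a target that do not occur in related sequences. It assumes exact string matching on one strand only, which is merely a proxy for hybridization specificity; the function names and the short example sequences are hypothetical.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def candidate_probes(target, others, k=17):
    """k-mers of target that occur in none of the other sequences (exact match only)."""
    shared = set()
    for other in others:
        shared |= kmers(other, k)
    return sorted(kmers(target, k) - shared)

# Hypothetical toy example with k=4 so the result is easy to inspect.
probes = candidate_probes("ATGGCCATTG", ["ATGGCCTTTG"], k=4)
```

In practice a candidate would also be checked against the complementary strand and tested under the stringent conditions defined above.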

The sequences falling within the scope of the present invention are not limited to these
specific sequences, but also include allelic and species variations thereof. Allelic and species
variations can be routinely determined by comparing the sequence provided in SEQ ID NO:1-1786
and 3573-5358, a representative fragment thereof, or a nucleotide sequence at least 90% identical,
preferably 95% identical, to SEQ ID NO:1-1786 and 3573-5358 with a sequence from another
isolate of the same species. Furthermore, to accommodate codon variability, the invention includes
nucleic acid molecules coding for the same amino acid sequences as do the specific ORFs disclosed
herein. In other words, in the coding region of an ORF, substitution of one codon for another codon
that encodes the same amino acid is expressly contemplated.
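
The codon substitution contemplated above can be checked by translating two ORFs with the standard genetic code and comparing the results. The following is a minimal sketch; the function names and the nine-base example ORFs are hypothetical, and the input is assumed to be in-frame DNA in upper case.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, amino acids listed in TCAG codon order ('*' marks a stop codon).
CODE = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), CODE)}

def translate(orf):
    """Translate an in-frame DNA ORF into one-letter amino acids (trailing partial codon ignored)."""
    usable = len(orf) - len(orf) % 3
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, usable, 3))

def encode_same_protein(orf_a, orf_b):
    """True if the two ORFs differ only by synonymous codon substitutions (or not at all)."""
    return translate(orf_a) == translate(orf_b)
```

For example, the hypothetical ORFs `ATGGCTAAA` and `ATGGCCAAG` both translate to `MAK`, so they code for the same amino acid sequence.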

The nearest neighbor or homology result for the nucleic acids of the present invention,
including SEQ ID NO:1-1786 and 3573-5358, can be obtained by searching a database using an
algorithm or a program. Preferably, BLAST, which stands for Basic Local Alignment Search Tool,
is used to search for local sequence alignments (Altschul, S.F. J. Mol. Evol. 36:290-300 (1993) and
Altschul, S.F. et al. J. Mol. Biol. 215:403-410 (1990)). Alternatively, a FASTA version 3 search
against Genpept, using the Fastxy algorithm, can be used.

Species homologs (or orthologs) of the disclosed polynucleotides and proteins are also
provided by the present invention. Species homologs may be isolated and identified by making
suitable probes or primers from the sequences provided herein and screening a suitable nucleic
acid source from the desired species.

The invention also encompasses allelic variants of the disclosed polynucleotides or
proteins; that is, naturally-occurring alternative forms of the isolated polynucleotide which also
encode proteins which are identical, homologous or related to that encoded by the
polynucleotides.

The nucleic acid sequences of the invention are further directed to sequences which
encode variants of the described nucleic acids. These amino acid sequence variants may be
prepared by methods known in the art by introducing appropriate nucleotide changes into a
native or variant polynucleotide. There are two variables in the construction of amino acid
sequence variants: the location of the mutation and the nature of the mutation. Nucleic acids
encoding the amino acid sequence variants are preferably constructed by mutating the
polynucleotide to encode an amino acid sequence that does not occur in nature. These nucleic
acid alterations can be made at sites that differ in the nucleic acids from different species
(variable positions) or in highly conserved regions (constant regions). Sites at such locations
will typically be modified in series, e.g., by substituting first with conservative choices (e.g.,
hydrophobic amino acid to a different hydrophobic amino acid) and then with more distant
choices (e.g., hydrophobic amino acid to a charged amino acid), and then deletions or insertions
may be made at the target site. Amino acid sequence deletions generally range from about 1 to
30 residues, preferably about 1 to 10 residues, and are typically contiguous. Amino acid
insertions include amino- and/or carboxyl-terminal fusions ranging in length from one to one
hundred or more residues, as well as intrasequence insertions of single or multiple amino acid
residues. Intrasequence insertions may range generally from about 1 to 10 amino acid residues,
preferably from 1 to 5 residues. Examples of terminal insertions include the heterologous signal
sequences necessary for secretion or for intracellular targeting in different host cells and
sequences such as FLAG or poly-histidine sequences useful for purifying the expressed protein.

In a preferred method, polynucleotides encoding the novel amino acid sequences are
changed via site-directed mutagenesis. This method uses oligonucleotide sequences to alter a
polynucleotide to encode the desired amino acid variant, as well as sufficient adjacent
nucleotides on both sides of the changed amino acid to form a stable duplex on either side of the
site being changed. In general, the techniques of site-directed mutagenesis are well known to
those of skill in the art and this technique is exemplified by publications such as Edelman et al.,
DNA 2:183 (1983). A versatile and efficient method for producing site-specific changes in a
polynucleotide sequence was published by Zoller and Smith, Nucleic Acids Res. 10:6487-6500
(1982). PCR may also be used to create amino acid sequence variants of the novel nucleic acids.
When small amounts of template DNA are used as starting material, primer(s) that differ
slightly in sequence from the corresponding region in the template DNA can generate the desired
amino acid variant. PCR amplification results in a population of product DNA fragments that
differ from the polynucleotide template encoding the polypeptide at the position specified by the
primer. The product DNA fragments replace the corresponding region in the plasmid and this
gives a polynucleotide encoding the desired amino acid variant.
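
The layout of a mutagenic oligonucleotide or primer described above (a changed codon flanked on both sides by unchanged template sequence) can be sketched as follows. This is an illustration only: the function name `mutagenic_oligo`, the default flank length of 12 nucleotides, and the example template are assumptions, and the sketch returns the sense-strand sequence without assessing duplex stability.

```python
def mutagenic_oligo(template, codon_index, new_codon, flank=12):
    """Return the changed codon flanked by unchanged template sequence on both sides.

    codon_index is 0-based and counted from the first base of template,
    which is assumed to be in frame.
    """
    start = 3 * codon_index
    if len(new_codon) != 3 or start + 3 > len(template):
        raise ValueError("codon out of range or not three bases long")
    left = template[max(0, start - flank):start]    # unchanged 5' flank
    right = template[start + 3:start + 3 + flank]   # unchanged 3' flank
    return left + new_codon + right

# Hypothetical template; change the third codon (AAA, Lys) to GAA (Glu).
oligo = mutagenic_oligo("ATGGCTAAAGGTCTGGAATTC", 2, "GAA", flank=6)
```

A longer flank would generally be chosen when a more stable duplex on either side of the changed site is needed.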

A further technique for generating amino acid variants is the cassette mutagenesis 
technique described in Wells et al., Gene 34:315 (1985); and other mutagenesis techniques well 
known in the art, such as, for example, the techniques in Sambrook et al., supra, and Current 
Protocols in Molecular Biology, Ausubel et al. Due to the inherent degeneracy of the genetic
code, other DNA sequences which encode substantially the same or a functionally equivalent
amino acid sequence may be used in the practice of the invention for the cloning and expression
of these novel nucleic acids. Such DNA sequences include those which are capable of
hybridizing to the appropriate novel nucleic acid sequence under stringent conditions.

Polynucleotides encoding preferred polypeptide truncations of the invention can be used
to generate polynucleotides encoding chimeric or fusion proteins comprising one or more 
domains of the invention and heterologous protein sequences. 

The polynucleotides of the invention additionally include the complement of any of the 
polynucleotides recited above. The polynucleotide can be DNA (genomic, cDNA, amplified, or 
synthetic) or RNA. Methods and algorithms for obtaining such polynucleotides are well known 
to those of skill in the art and can include, for example, methods for determining hybridization 
conditions that can routinely isolate polynucleotides of the desired sequence identities. 

In accordance with the invention, polynucleotide sequences comprising the mature 
protein coding sequences corresponding to any one of SEQ ID NO:1-1786 and 3573-5358, or
functional equivalents thereof, may be used to generate recombinant DNA molecules that direct 
the expression of that nucleic acid, or a functional equivalent thereof, in appropriate host cells. 
Also included are the cDNA inserts of any of the clones identified herein. 

A polynucleotide according to the invention can be joined to any of a variety of other 
nucleotide sequences by well-established recombinant DNA techniques (see Sambrook J et al. 
(1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, NY). Useful 
nucleotide sequences for joining to polynucleotides include an assortment of vectors, e.g., 
plasmids, cosmids, lambda phage derivatives, phagemids, and the like, that are well known in the 
art. Accordingly, the invention also provides a vector including a polynucleotide of the 
invention and a host cell containing the polynucleotide. In general, the vector contains an origin 
of replication functional in at least one organism, convenient restriction endonuclease sites, and a 
selectable marker for the host cell. Vectors according to the invention include expression 
vectors, replication vectors, probe generation vectors, and sequencing vectors. A host cell 
according to the invention can be a prokaryotic or eukaryotic cell and can be a unicellular
organism or part of a multicellular organism. 

The present invention further provides recombinant constructs comprising a nucleic acid 
having any of the nucleotide sequences of SEQ ID NO:1-1786 and 3573-5358 or a fragment
thereof or any other polynucleotides of the invention. In one embodiment, the recombinant
constructs of the present invention comprise a vector, such as a plasmid or viral vector, into
which a nucleic acid having any of the nucleotide sequences of SEQ ID NO:1-1786 and
3573-5358 or a fragment thereof is inserted, in a forward or reverse orientation. In the case of a vector
comprising one of the ORFs of the present invention, the vector may further comprise regulatory 
sequences, including for example, a promoter, operably linked to the ORF. Large numbers of 
suitable vectors and promoters are known to those of skill in the art and are commercially 
available for generating the recombinant constructs of the present invention. The following 
vectors are provided by way of example. Bacterial: pBs, phagescript, PsiX174, pBluescript SK,
pBs KS, pNH8a, pNH16a, pNH18a, pNH46a (Stratagene); pTrc99A, pKK223-3, pKK233-3,
pDR540, pRIT5 (Pharmacia). Eukaryotic: pWLneo, pSV2cat, pOG44, PXTI, pSG (Stratagene);
pSVK3, pBPV, pMSG, pSVL (Pharmacia).

The isolated polynucleotide of the invention may be operably linked to an expression
control sequence such as the pMT2 or pED expression vectors disclosed in Kaufman et al.,
Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein recombinantly. Many
suitable expression control sequences are known in the art. General methods of expressing
recombinant proteins are also known and are exemplified in R. Kaufman, Methods in
Enzymology 185, 537-566 (1990). As defined herein "operably linked" means that the isolated
polynucleotide of the invention and an expression control sequence are situated within a vector
or cell in such a way that the protein is expressed by a host cell which has been transformed
(transfected) with the ligated polynucleotide/expression control sequence.

Promoter regions can be selected from any desired gene using CAT (chloramphenicol
acetyltransferase) vectors or other vectors with selectable markers. Two appropriate vectors are
pKK232-8 and pCM7. Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt,
lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV thymidine
kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of
the appropriate vector and promoter is well within the level of ordinary skill in the art.

Generally, recombinant expression vectors will include origins of replication and selectable
markers permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli
and S. cerevisiae TRP1 gene, and a promoter derived from a highly-expressed gene to direct
transcription of a downstream structural sequence. Such promoters can be derived from operons
encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid
phosphatase, or heat shock proteins, among others. The heterologous structural sequence is
assembled in appropriate phase with translation initiation and termination sequences, and
preferably, a leader sequence capable of directing secretion of translated protein into the
periplasmic space or extracellular medium. Optionally, the heterologous sequence can encode a
fusion protein including an amino terminal identification peptide imparting desired
characteristics, e.g., stabilization or simplified purification of expressed recombinant product.

Useful expression vectors for bacterial use are constructed by inserting a structural DNA
sequence encoding a desired protein together with suitable translation initiation and termination
signals in operable reading phase with a functional promoter. The vector will comprise one or
more phenotypic selectable markers and an origin of replication to ensure maintenance of the
vector and to, if desirable, provide amplification within the host. Suitable prokaryotic hosts for
transformation include E. coli, Bacillus subtilis, Salmonella typhimurium and various species
within the genera Pseudomonas, Streptomyces, and Staphylococcus, although others may also be
employed as a matter of choice.

As a representative but non-limiting example, useful expression vectors for bacterial use
can comprise a selectable marker and bacterial origin of replication derived from commercially
available plasmids comprising genetic elements of the well known cloning vector pBR322
(ATCC 37017). Such commercial vectors include, for example, pKK223-3 (Pharmacia Fine
Chemicals, Uppsala, Sweden) and GEM1 (Promega Biotech, Madison, WI, USA). These
pBR322 "backbone" sections are combined with an appropriate promoter and the structural
sequence to be expressed. Following transformation of a suitable host strain and growth of the
host strain to an appropriate cell density, the selected promoter is induced or derepressed by
appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an
additional period. Cells are typically harvested by centrifugation, disrupted by physical or
chemical means, and the resulting crude extract retained for further purification.

Polynucleotides of the invention can also be used to induce immune responses. For
example, as described in Fan et al., Nat. Biotech. 17:870-872 (1999), incorporated herein by
reference, nucleic acid sequences encoding a polypeptide may be used to generate antibodies
against the encoded polypeptide following topical administration of naked plasmid DNA or
following injection, and preferably intramuscular injection of the DNA. The nucleic acid
sequences are preferably inserted in a recombinant expression vector and may be in the form of
naked DNA.



4.3 ANTISENSE 

Another aspect of the invention pertains to isolated antisense nucleic acid molecules that
are hybridizable to or complementary to the nucleic acid molecule comprising the nucleotide
sequence of SEQ ID NO:1-1786 and 3573-5358, or fragments, analogs or derivatives thereof.
An "antisense" nucleic acid comprises a nucleotide sequence that is complementary to a "sense"
nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded
cDNA molecule or complementary to an mRNA sequence. In specific aspects, antisense nucleic
acid molecules are provided that comprise a sequence complementary to at least about 10, 25,
50, 100, 250 or 500 nucleotides or an entire coding strand, or to only a portion thereof. Nucleic
acid molecules encoding fragments, homologs, derivatives and analogs of a protein of any of
SEQ ID NO:1787-3572 and 5359-7144 or antisense nucleic acids complementary to a nucleic
acid sequence of SEQ ID NO:1-1786 and 3573-5358 are additionally provided.

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding region"
of the coding strand of a nucleotide sequence of the invention. The term "coding region" refers 
to the region of the nucleotide sequence comprising codons which are translated into amino acid 
residues. In another embodiment, the antisense nucleic acid molecule is antisense to a 
"noncoding region" of the coding strand of a nucleotide sequence of the invention. The term 
"noncoding region" refers to 5' and 3' sequences which flank the coding region that are not
translated into amino acids (i.e., also referred to as 5' and 3' untranslated regions).

Given the coding strand sequences encoding a nucleic acid disclosed herein (e.g., SEQ ID
NO:1-1786 and 3573-5358), antisense nucleic acids of the invention can be designed according
to the rules of Watson and Crick or Hoogsteen base pairing. The antisense nucleic acid molecule 
can be complementary to the entire coding region of a mRNA, but more preferably is an 
oligonucleotide that is antisense to only a portion of the coding or noncoding region of a mRNA. 
For example, the antisense oligonucleotide can be complementary to the region surrounding the 
translation start site of a mRNA. An antisense oligonucleotide can be, for example, about 5, 10, 
15, 20, 25, 30, 35, 40, 45 or 50 nucleotides in length. An antisense nucleic acid of the invention 
can be constructed using chemical synthesis or enzymatic ligation reactions using procedures 
known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can 
be chemically synthesized using naturally occurring nucleotides or variously modified 
nucleotides designed to increase the biological stability of the molecules or to increase the 
physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., 
phosphorothioate derivatives and acridine substituted nucleotides can be used. 
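
As a simple illustration of designing an antisense oligonucleotide by Watson and Crick base pairing, the sketch below takes the reverse complement of the region surrounding a translation start site. It is a sketch under stated assumptions: the sense strand is given in the DNA alphabet, the position of the start codon is supplied by the caller, and the function names, flank lengths, and example sequence are hypothetical. Hoogsteen pairing and modified nucleotides are not modeled.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the Watson-Crick reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def antisense_to_start_site(sense, atg_index, upstream=8, downstream=9):
    """Antisense oligo covering the start codon plus flanking bases of the sense strand."""
    begin = max(0, atg_index - upstream)
    end = min(len(sense), atg_index + 3 + downstream)
    return reverse_complement(sense[begin:end])

# Hypothetical sense strand with ATG at 0-based index 6.
oligo = antisense_to_start_site("GGCACCATGGCTAAAGGT", 6, upstream=4, downstream=5)
```

The resulting 12-nucleotide oligonucleotide falls within the length range of about 5 to 50 nucleotides given above.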

Examples of modified nucleotides that can be used to generate the antisense nucleic acid 
include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 
4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl- 
2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, 
inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 
2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 
7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, 
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil,
queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil,
uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil,
3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the
antisense nucleic acid can be produced biologically using an expression vector into which a
nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the
inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest,
described further in the following subsection).

The antisense nucleic acid molecules of the invention are typically administered to a 
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or 
genomic DNA encoding a protein according to the invention to thereby inhibit expression of the
protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by 
conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of 
an antisense nucleic acid molecule that binds to DNA duplexes, through specific interactions in 
the major groove of the double helix. An example of a route of administration of antisense 
nucleic acid molecules of the invention includes direct injection at a tissue site. Alternatively,
antisense nucleic acid molecules can be modified to target selected cells and then administered 
systemically. For example, for systemic administration, antisense molecules can be modified 
such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., 
by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface 
receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using
the vectors described herein. To achieve sufficient intracellular concentrations of antisense 
molecules, vector constructs in which the antisense nucleic acid molecule is placed under the 
control of a strong pol II or pol III promoter are preferred. 

In yet another embodiment, the antisense nucleic acid molecule of the invention is an
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific
double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the
strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15: 6625-6641). The
antisense nucleic acid molecule can also comprise a 2'-O-methylribonucleotide (Inoue et al.
(1987) Nucleic Acids Res 15: 6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987)
FEBS Lett 215: 327-330).

4.4 RIBOZYMES AND PNA MOIETIES 

In still another embodiment, an antisense nucleic acid of the invention is a ribozyme.
Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of cleaving a
single-stranded nucleic acid, such as an mRNA, to which they have a complementary region.
Thus, ribozymes (e.g., hammerhead ribozymes (described in Haselhoff and Gerlach (1988)
Nature 334:585-591)) can be used to catalytically cleave mRNA transcripts to thereby inhibit
translation of an mRNA. A ribozyme having specificity for a nucleic acid of the invention can be
designed based upon the nucleotide sequence of a DNA disclosed herein (i.e., SEQ ID NO:1-
1786 and 3573-5358). For example, a derivative of a Tetrahymena L-19 IVS RNA can be
constructed in which the nucleotide sequence of the active site is complementary to the
nucleotide sequence to be cleaved in a SECX-encoding mRNA. See, e.g., Cech et al. U.S. Pat.
No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively, SECX mRNA can be
used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA
molecules. See, e.g., Bartel et al. (1993) Science 261:1411-1418.

Alternatively, gene expression can be inhibited by targeting nucleotide sequences
complementary to the regulatory region (e.g., promoter and/or enhancers) to form triple helical
structures that prevent transcription of the gene in target cells. See generally, Helene (1991)
Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and
Maher (1992) Bioassays 14: 807-15.

In various embodiments, the nucleic acids of the invention can be modified at the base 
moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or 
solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic 
acids can be modified to generate peptide nucleic acids (see Hyrup et al. (1996) Bioorg Med
Chem 4: 5-23). As used herein, the terms "peptide nucleic acids" or "PNAs" refer to nucleic acid
mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by a 
pseudopeptide backbone and only the four natural nucleobases are retained. The neutral 
backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under 
conditions of low ionic strength. The synthesis of PNA oligomers can be performed using 
standard solid phase peptide synthesis protocols as described in Hyrup et al (1996) above; 
Perry-O'Keefe et al (1996) PNAS 93: 14670-675. 

PNAs of the invention can be used in therapeutic and diagnostic applications. For 
example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of 
gene expression by, e.g., inducing transcription or translation arrest or inhibiting replication. 
PNAs of the invention can also be used, e.g., in the analysis of single base pair mutations in a 
gene by, e.g., PNA directed PCR clamping; as artificial restriction enzymes when used in 
combination with other enzymes, e.g., S1 nucleases (Hyrup B. (1996) above); or as probes or
primers for DNA sequence and hybridization (Hyrup et al (1996), above; Perry-O'Keefe (1996), 
above). 

In another embodiment, PNAs of the invention can be modified, e.g., to enhance their 
stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the 
formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug 
delivery known in the art. For example, PNA-DNA chimeras can be generated that may 
combine the advantageous properties of PNA and DNA. Such chimeras allow DNA recognition 
enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA portion while the PNA 
portion would provide high binding affinity and specificity. PNA-DNA chimeras can be linked
using linkers of appropriate lengths selected in terms of base stacking, number of bonds between
the nucleobases, and orientation (Hyrup (1996) above). The synthesis of PNA-DNA chimeras
can be performed as described in Hyrup (1996) above and Finn et al. (1996) Nucl Acids Res 24:
3357-63. For example, a DNA chain can be synthesized on a solid support using standard
phosphoramidite coupling chemistry, and modified nucleoside analogs, e.g.,
5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite, can be used between the PNA
and the 5' end of DNA (Mag et al. (1989) Nucl Acid Res 17: 5973-88). PNA monomers are then
coupled in a stepwise manner to produce a chimeric molecule with a 5' PNA segment and a 3'
DNA segment (Finn et al. (1996) above). Alternatively, chimeric molecules can be synthesized
with a 5' DNA segment and a 3' PNA segment. See, Petersen et al. (1975) Bioorg Med Chem
Lett 5: 1119-11124.

In other embodiments, the oligonucleotide may include other appended groups such as
peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the
cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. U.S.A. 86:6553-6556;
Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT Publication No. WO88/09810) or
the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition,
oligonucleotides can be modified with hybridization-triggered cleavage agents (see, e.g., Krol et
al., 1988, BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon, 1988, Pharm. Res.
5: 539-549). To this end, the oligonucleotide may be conjugated to another molecule, e.g., a
peptide, a hybridization-triggered cross-linking agent, a transport agent, a hybridization-triggered
cleavage agent, etc.

4.5 HOSTS 

The present invention further provides host cells genetically engineered to contain the
polynucleotides of the invention. For example, such host cells may contain nucleic acids of the
invention introduced into the host cell using known transformation, transfection or infection
methods. The present invention still further provides host cells genetically engineered to express
the polynucleotides of the invention, wherein such polynucleotides are in operative association
with a regulatory sequence heterologous to the host cell which drives expression of the
polynucleotides in the cell.

Knowledge of nucleic acid sequences allows for modification of cells to permit, or
increase, expression of endogenous polypeptide. Cells can be modified (e.g., by homologous
recombination) to provide increased polypeptide expression by replacing, in whole or in part, the
naturally occurring promoter with all or part of a heterologous promoter so that the cells express
the polypeptide at higher levels. The heterologous promoter is inserted in such a manner that it
is operatively linked to the encoding sequences. See, for example, PCT International Publication
No. WO94/12650, PCT International Publication No. WO92/20808, and PCT International
Publication No. WO91/09955. It is also contemplated that, in addition to heterologous promoter
DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which 
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or 
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the coding 
sequence, amplification of the marker DNA by standard selection methods results in co- 
amplification of the desired protein coding sequences in the cells. 

The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower
eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a
bacterial cell. Introduction of the recombinant construct into the host cell can be effected by
calcium phosphate transfection, DEAE-dextran mediated transfection, or electroporation (Davis,
L. et al., Basic Methods in Molecular Biology (1986)). The host cells containing one of the
polynucleotides of the invention can be used in conventional manners to produce the gene
product encoded by the isolated fragment (in the case of an ORF) or can be used to produce a 
heterologous protein under the control of the EMF. 

Any host/vector system can be used to express one or more of the ORFs of the present 
invention. These include, but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells,
COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and B. subtilis.
The most preferred cells are those which do not normally express the particular polypeptide or
protein or which express the polypeptide or protein at a low natural level. Mature proteins can
be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate
promoters. Cell-free translation systems can also be employed to produce such proteins using
RNAs derived from the DNA constructs of the present invention. Appropriate cloning and
expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook, et
al., in Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, New
York (1989), the disclosure of which is hereby incorporated by reference. 

Various mammalian cell culture systems can also be employed to express recombinant 
protein. Examples of mammalian expression systems include the COS-7 lines of monkey kidney 
fibroblasts, described by Gluzman, Cell 23:175 (1981). Other cell lines capable of expressing a 
compatible vector are, for example, C127 cells, monkey COS cells, Chinese Hamster Ovary
(CHO) cells, human kidney 293 cells, human epidermal A431 cells, human Colo205 cells, 3T3
cells, CV-1 cells, other transformed primate cell lines, normal diploid cells, cell strains derived
from in vitro culture of primary tissue, primary explants, HeLa cells, mouse L cells, BHK,
HL-60, U937, HaK or Jurkat cells. Mammalian expression vectors will comprise an origin of
replication, a suitable promoter and also any necessary ribosome binding sites, polyadenylation
site, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking
nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for example,
SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide
the required nontranscribed genetic elements. Recombinant polypeptides and proteins produced
in bacterial culture are usually isolated by initial extraction from cell pellets, followed by one or
more salting-out, aqueous ion exchange or size exclusion chromatography steps. Protein
refolding steps can be used, as necessary, in completing configuration of the mature protein.
Finally, high performance liquid chromatography (HPLC) can be employed for final purification
steps. Microbial cells employed in expression of proteins can be disrupted by any convenient 
method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing 
agents. 

Alternatively, it may be possible to produce the protein in lower eukaryotes such as yeast
or insects or in prokaryotes such as bacteria. Potentially suitable yeast strains include
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains, Candida, or
any yeast strain capable of expressing heterologous proteins. Potentially suitable bacterial
strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium, or any bacterial
strain capable of expressing heterologous proteins. If the protein is made in yeast or bacteria, it
may be necessary to modify the protein produced therein, for example by phosphorylation or
glycosylation of the appropriate sites, in order to obtain the functional protein. Such covalent
attachments may be accomplished using known chemical or enzymatic methods.

In another embodiment of the present invention, cells and tissues may be engineered to
express an endogenous gene comprising the polynucleotides of the invention under the control of
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene
may be replaced by homologous recombination. As described herein, gene targeting can be used
to replace a gene's existing regulatory region with a regulatory sequence isolated from a different
gene or a novel regulatory sequence synthesized by genetic engineering methods. Such
regulatory sequences may be comprised of promoters, enhancers, scaffold-attachment regions,
negative regulatory elements, transcriptional initiation sites, regulatory protein binding sites or
combinations of said sequences. Alternatively, sequences which affect the structure or stability
of the RNA or protein produced may be replaced, removed, added, or otherwise modified by
targeting. These sequences include polyadenylation signals, mRNA stability elements, splice
sites, leader sequences for enhancing or modifying transport or secretion properties of the
protein, or other sequences which alter or improve the function or stability of protein or RNA
molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the 
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or 
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple deletion 
of a regulatory element, such as the deletion of a tissue-specific negative regulatory element. 
Alternatively, the targeting event may replace an existing element; for example, a tissue-specific 
enhancer can be replaced by an enhancer that has broader or different cell-type specificity than 
the naturally occurring elements. Here, the naturally occurring sequences are deleted and new 
sequences are added. In all cases, the identification of the targeting event may be facilitated by 
the use of one or more selectable marker genes that are contiguous with the targeting DNA,
allowing for the selection of cells in which the exogenous DNA has integrated into the host cell 
genome. The identification of the targeting event may also be facilitated by the use of one or 
more marker genes exhibiting the property of negative selection, such that the negatively 
selectable marker is linked to the exogenous DNA, but configured such that the negatively 
selectable marker flanks the targeting sequence, and such that a correct homologous 
recombination event with sequences in the host cell genome does not result in the stable 
integration of the negatively selectable marker. Markers useful for this purpose include the 
Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial xanthine-guanine 
phosphoribosyl-transferase (gpt) gene. 

The gene targeting or gene activation techniques which can be used in accordance with 
this aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to 
Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No.
PCT/US92/09627 (WO93/09222) by Selden et al; and International Application No. 
PCT/US90/06436 (WO91/06667) by Skoultchi et al, each of which is incorporated by reference 
herein in its entirety. 

4.6 POLYPEPTIDES OF THE INVENTION 

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising: the amino acid sequences set forth as any one of SEQ ID NO:1787-3572 and 5359-
7144 or an amino acid sequence encoded by any one of the nucleotide sequences SEQ ID NO:1-
1786 and 3573-5358 or the corresponding full length or mature protein. Polypeptides of the
invention also include polypeptides preferably with biological or immunological activity that are
encoded by: (a) a polynucleotide having any one of the nucleotide sequences set forth in SEQ ID
NO:1-1786 and 3573-5358 or (b) polynucleotides encoding any one of the amino acid sequences
set forth as SEQ ID NO:1787-3572 and 5359-7144 or (c) polynucleotides that hybridize to the
complement of the polynucleotides of either (a) or (b) under stringent hybridization conditions.
The invention also provides biologically active or immunologically active variants of any of the
amino acid sequences set forth as SEQ ID NO:1787-3572 and 5359-7144 or the corresponding
full length or mature protein; and "substantial equivalents" thereof (e.g., with at least about
65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least
about 90%, typically at least about 95%, more typically at least about 98%, or most typically at
least about 99% amino acid identity) that retain biological activity. Polypeptides encoded by
allelic variants may have a similar, increased, or decreased activity compared to polypeptides
comprising SEQ ID NO:1787-3572 and 5359-7144.
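
As a rough illustration of how the percent amino acid identity thresholds recited above might be checked, the following Python sketch computes identity over a pre-computed pairwise alignment. It assumes that the alignment has already been produced by a separate program and that identity is counted over all aligned columns, including gapped ones. Alignment programs differ in the denominator they use, so this convention is an assumption, and the function names and peptide strings are hypothetical.

```python
# Illustrative sketch: identity is computed over a pre-computed alignment in
# which '-' marks a gap. The denominator convention is an assumption.
THRESHOLDS = (65, 70, 75, 80, 85, 90, 95, 98, 99)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns holding the same residue in both sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences carry a gap.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == "-" and y == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


def highest_threshold_met(identity: float):
    """Return the largest recited threshold that `identity` reaches, or None."""
    met = [t for t in THRESHOLDS if identity >= t]
    return max(met) if met else None


identity = percent_identity("MKTAYIAKQR", "MKTAYLAKQR")  # one mismatch in ten
print(identity, highest_threshold_met(identity))
```

Identity alone says nothing about whether a variant retains biological activity, which would still have to be determined experimentally.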

Fragments of the proteins of the present invention which are capable of exhibiting 
biological activity are also encompassed by the present invention. Fragments of the protein may 
be in linear form or they may be cyclized using known methods, for example, as described in H. 
U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and in R. S. McDowell, et al., J. Amer.
Chem. Soc. 114, 9245-9253 (1992), both of which are incorporated herein by reference. Such
fragments may be fused to carrier molecules such as immunoglobulins for many purposes, 
including increasing the valency of protein binding sites. 

The present invention also provides both full-length and mature forms (for example,
without a signal sequence or precursor sequence) of the disclosed proteins. The protein coding
sequence is identified in the sequence listing by translation of the disclosed nucleotide
sequences. The mature form of such protein may be obtained by expression of a full-length
polynucleotide in a suitable mammalian cell or other host cell. The sequence of the mature form
of the protein is also determinable from the amino acid sequence of the full-length form. Where
proteins of the present invention are membrane bound, soluble forms of the proteins are also
provided. In such forms, part or all of the regions causing the proteins to be membrane bound
are deleted so that the proteins are fully secreted from the cell in which they are expressed.

Protein compositions of the present invention may further comprise an acceptable carrier, 
such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.

The present invention further provides isolated polypeptides encoded by the nucleic acid 
fragments of the present invention or by degenerate variants of the nucleic acid fragments of the
present invention. By "degenerate variant" is intended nucleotide fragments which differ from a 
nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide sequence but, due to 
the degeneracy of the genetic code, encode an identical polypeptide sequence. Preferred nucleic 
acid fragments of the present invention are the ORFs that encode proteins. 
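
The definition of a "degenerate variant" given above can be made concrete with a short Python sketch that translates two nucleotide sequences using the standard genetic code and tests whether they differ in sequence yet encode the same polypeptide. The ORFs shown are invented examples and the function names are hypothetical. The sketch assumes in-frame DNA sequences written 5' to 3'.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence; '*' denotes a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def is_degenerate_variant(orf_a: str, orf_b: str) -> bool:
    """True if the sequences differ but encode an identical polypeptide."""
    return orf_a.upper() != orf_b.upper() and translate(orf_a) == translate(orf_b)


orf = "ATGGCTAGCAAGTAA"      # encodes M-A-S-K-stop
variant = "ATGGCCTCTAAATAA"  # synonymous codons at positions 2 to 4
print(translate(orf), is_degenerate_variant(orf, variant))
```

A single non-synonymous change, such as AAG (Lys) to AAC (Asn), alters the encoded polypeptide, so the resulting sequence would not qualify as a degenerate variant under this test.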

A variety of methodologies known in the art can be utilized to obtain any one of the
isolated polypeptides or proteins of the present invention. At the simplest level, the amino acid
sequence can be synthesized using commercially available peptide synthesizers. The
synthetically-constructed protein sequences, by virtue of sharing primary, secondary or tertiary
structural and/or conformational characteristics with proteins may possess biological properties
in common therewith, including protein activity. This technique is particularly useful in
producing small peptides and fragments of larger polypeptides. Fragments are useful, for
example, in generating antibodies against the native polypeptide. Thus, they may be employed
as biologically active or immunological substitutes for natural, purified proteins in screening of
therapeutic compounds and in immunological processes for the development of antibodies.

The polypeptides and proteins of the present invention can alternatively be purified from 
cells which have been altered to express the desired polypeptide or protein. As used herein, a 
cell is said to be altered to express a desired polypeptide or protein when the cell, through genetic 
manipulation, is made to produce a polypeptide or protein which it normally does not produce or
which the cell normally produces at a lower level. One skilled in the art can readily adapt
procedures for introducing and expressing either recombinant or synthetic sequences into 
eukaryotic or prokaryotic cells in order to generate a cell which produces one of the polypeptides 
or proteins of the present invention. 

The invention also relates to methods for producing a polypeptide comprising growing a
culture of host cells of the invention in a suitable culture medium, and purifying the protein from
the cells or the culture in which the cells are grown. For example, the methods of the invention
include a process for producing a polypeptide in which a host cell containing a suitable
expression vector that includes a polynucleotide of the invention is cultured under conditions that
allow expression of the encoded polypeptide. The polypeptide can be recovered from the
culture, conveniently from the culture medium, or from a lysate prepared from the host cells and
further purified. Preferred embodiments include those in which the protein produced by such
process is a full length or mature form of the protein.

In an alternative method, the polypeptide or protein is purified from bacterial cells which 
naturally produce the polypeptide or protein. One skilled in the art can readily follow known
methods for isolating polypeptides and proteins in order to obtain one of the isolated
polypeptides or proteins of the present invention. These include, but are not limited to,
immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography,
and immuno-affinity chromatography. See, e.g., Scopes, Protein Purification: Principles and
Practice, Springer-Verlag (1994); Sambrook, et al., in Molecular Cloning: A Laboratory
Manual; Ausubel et al., Current Protocols in Molecular Biology. Polypeptide fragments that
retain biological/immunological activity include fragments comprising greater than about 100
amino acids, or greater than about 200 amino acids, and fragments that encode specific protein 
domains. 

The purified polypeptides can be used in in vitro binding assays which are well known in 
the art to identify molecules which bind to the polypeptides. These molecules include, but are not
limited to, e.g., small molecules, molecules from combinatorial libraries, antibodies or other
proteins. The molecules identified in the binding assay are then tested for antagonist or agonist 
activity in in vivo tissue culture or animal models that are well known in the art. In brief, the 
molecules are titrated into a plurality of cell cultures or animals and then tested for either 
cell/animal death or prolonged survival of the animal/cells. 

In addition, the peptides of the invention or molecules capable of binding to the peptides 
may be complexed with toxins, e.g., ricin or cholera, or with other compounds that are toxic to
cells. The toxin-binding molecule complex is then targeted to a tumor or other cell by the 
specificity of the binding molecule for SEQ ID NO:1787-3572 and 5359-7144. 

The protein of the invention may also be expressed as a product of transgenic animals, 
e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which are characterized 
by somatic or germ cells containing a nucleotide sequence encoding the protein. 

The proteins provided herein also include proteins characterized by amino acid sequences 
similar to those of purified proteins but into which modifications are naturally provided or
deliberately engineered. For example, modifications, in the peptide or DNA sequence, can be 
made by those skilled in the art using known techniques. Modifications of interest in the protein 
sequences may include the alteration, substitution, replacement, insertion or deletion of a 
selected amino acid residue in the coding sequence. For example, one or more of the cysteine 
residues may be deleted or replaced with another amino acid to alter the conformation of the 
molecule. Techniques for such alteration, substitution, replacement, insertion or deletion are 
well known to those skilled in the art (see, e.g., U.S. Pat. No. 4,518,584). Preferably, such
alteration, substitution, replacement, insertion or deletion retains the desired activity of the 
protein. Regions of the protein that are important for the protein function can be determined by 
various methods known in the art including the alanine-scanning method which involves
systematic substitution of single or strings of amino acids with alanine, followed by testing the 
resulting alanine-containing variant for biological activity. This type of analysis determines the 
importance of the substituted amino acid(s) in biological activity. Regions of the protein that are 
important for protein function may be determined by the eMATRIX program. 
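
The alanine-scanning approach described above can be sketched in Python as the systematic generation of variant sequences in which single residues, or strings of residues, are replaced with alanine. The function name `alanine_scan` and the five-residue peptide are hypothetical, and skipping windows that already consist only of alanine is a design choice of this sketch. Each variant generated in this way would still have to be expressed and tested for biological activity.

```python
# Illustrative sketch: enumerate alanine-substituted variants of a protein.
def alanine_scan(protein: str, window: int = 1):
    """Yield (position, variant) pairs, where `position` is the 1-based start
    of a run of `window` residues replaced with alanine."""
    if window < 1:
        raise ValueError("window must be at least 1")
    for i in range(len(protein) - window + 1):
        segment = protein[i:i + window]
        if set(segment) == {"A"}:
            continue  # substitution would leave the sequence unchanged
        yield i + 1, protein[:i] + "A" * window + protein[i + window:]


# Single-residue scan of a made-up five-residue peptide.
for position, variant in alanine_scan("MKTAY"):
    print(position, variant)
```

Setting `window` to a value greater than 1 corresponds to substituting strings of amino acids rather than single residues.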

Other fragments and derivatives of the sequences of proteins which would be expected to 
retain protein activity in whole or in part and are useful for screening or other immunological 
methodologies may also be easily made by those skilled in the art given the disclosures herein.
Such modifications are encompassed by the present invention. 

The protein may also be produced by operably linking the isolated polynucleotide of the 
invention to suitable control sequences in one or more insect expression vectors, and employing 
an insect expression system. Materials and methods for baculovirus/insect cell expression 
systems are commercially available in kit form from, e.g., Invitrogen, San Diego, Calif., U.S.A.
(the MaxBac™ kit), and such methods are well known in the art, as described in Summers and
Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987), incorporated herein by 
reference. As used herein, an insect cell capable of expressing a polynucleotide of the present 
invention is "transformed." 

The protein of the invention may be prepared by culturing transformed host cells under 
culture conditions suitable to express the recombinant protein. The resulting expressed protein 
may then be purified from such culture (i.e., from culture medium or cell extracts) using known 
purification processes, such as gel filtration and ion exchange chromatography. The purification 
of the protein may also include an affinity column containing agents which will bind to the 
protein; one or more column steps over such affinity resins as concanavalin A-agarose,
heparin-toyopearl™ or Cibacron blue 3GA Sepharose™; one or more steps involving
hydrophobic interaction chromatography using such resins as phenyl ether, butyl ether, or propyl 
ether; or immunoaffinity chromatography. 

Alternatively, the protein of the invention may also be expressed in a form which will 
facilitate purification. For example, it may be expressed as a fusion protein, such as those of 
maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin (TRX), or as a 
His tag. Kits for expression and purification of such fusion proteins are commercially available 
from New England BioLabs (Beverly, Mass.), Pharmacia (Piscataway, N.J.) and Invitrogen,
respectively. The protein can also be tagged with an epitope and subsequently purified by using 
a specific antibody directed to such epitope. One such epitope ("FLAG®") is commercially 
available from Kodak (New Haven, Conn.). 

Finally, one or more reverse-phase high performance liquid chromatography (RP-HPLC)
steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant methyl or other 
aliphatic groups, can be employed to further purify the protein. Some or all of the foregoing 
purification steps, in various combinations, can also be employed to provide a substantially 
homogeneous isolated recombinant protein. The protein thus purified is substantially free of 
other mammalian proteins and is defined in accordance with the present invention as an "isolated 
protein."

The polypeptides of the invention include analogs (variants). This embraces fragments,
as well as peptides in which one or more amino acids have been deleted, inserted, or substituted.
Also, analogs of the polypeptides of the invention embrace fusions of the polypeptides or
modifications of the polypeptides of the invention, wherein the polypeptide or analog is fused to
another moiety or moieties, e.g., targeting moiety or another therapeutic agent. Such analogs
may exhibit improved properties such as activity and/or stability. Examples of moieties which
may be fused to the polypeptide or an analog include, for example, targeting moieties which
provide for the delivery of polypeptide to pancreatic cells, e.g., antibodies to pancreatic cells,
antibodies to immune cells such as T-cells, monocytes, dendritic cells, granulocytes, etc., as well
as receptors and ligands expressed on pancreatic or immune cells. Other moieties which may be
fused to the polypeptide include therapeutic agents which are used for treatment, for example,
immunosuppressive drugs such as cyclosporin, FK506, azathioprine, CD3 antibodies and
steroids. Also, polypeptides may be fused to immune modulators, and other cytokines such as
alpha or beta interferon.

4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE IDENTITY 
AND SIMILARITY 

Preferred methods to determine identity and/or similarity are designed to give the largest match 
between the sequences tested. Methods to determine identity and similarity are codified in computer 
programs including, but not limited to, the GCG program package, including GAP 
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics Computer Group, 
University of Wisconsin, Madison, WI), BLASTP, BLASTN, BLASTX, FASTA (Altschul, S.F. 
et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST (Altschul S.F. et al., Nucleic Acids Res. 
vol. 25, pp. 3389-3402, herein incorporated by reference), eMatrix software (Wu et al., J. Comp. 
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), eMotif software 
(Nevill-Manning et al., ISMB-97, Vol. 4, pp. 202-209, herein incorporated by reference), pFam software 
(Sonnhammer et al., Nucleic Acids Res., Vol. 26(1), pp. 320-322 (1998), herein incorporated by 
reference) and the Kyte-Doolittle hydrophobicity prediction algorithm (J. Mol. Biol., 157, pp. 
105-31 (1982), incorporated herein by reference). The BLAST programs are publicly available 
from the National Center for Biotechnology Information (NCBI) and other sources (BLAST 
Manual, Altschul, S., et al. NCBI NLM NIH Bethesda, MD 20894; Altschul, S., et al., J. Mol. 
Biol. 215:403-410 (1990)). 
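
By way of illustration only, the idea of an alignment that gives the largest match between two 
sequences can be sketched as a simple global (Needleman-Wunsch style) alignment followed by a 
percent identity calculation. This sketch is not the GAP or BLAST implementation. The scoring 
values (match +1, mismatch -1, gap -1), the function names global_align and percent_identity, and 
the use of the alignment length as the denominator are all assumptions made for the example. 

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align two sequences; scoring values are illustrative assumptions."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of alignment columns."""
    top, bottom = global_align(a, b)
    if not top:
        return 0.0
    matches = sum(1 for x, y in zip(top, bottom) if x == y and x != "-")
    return 100.0 * matches / len(top)


if __name__ == "__main__":
    print(global_align("ACGT", "AGT"))
    print(percent_identity("ACGT", "AGT"))
```

Other denominators (for example, the length of the shorter sequence) and other gap penalties are 
equally defensible and give different percentages, so reported identity values are comparable only 
when the program and its parameters are stated. 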

4.7 CHIMERIC AND FUSION PROTEINS 

The invention also provides chimeric or fusion proteins. As used herein, a "chimeric 
protein" or "fusion protein" comprises a polypeptide of the invention operatively linked to 
another polypeptide. Within a fusion protein the polypeptide according to the invention can 
correspond to all or a portion of a protein according to the invention. In one embodiment, a 
fusion protein comprises at least one biologically active portion of a protein according to the 
invention. In another embodiment, a fusion protein comprises at least two biologically active 
portions of a protein according to the invention. Within the fusion protein, the term "operatively 
linked" is intended to indicate that the polypeptide according to the invention and the other 
polypeptide are fused in-frame to each other. The polypeptide can be fused to the N-terminus or 
C-terminus. 

For example, in one embodiment a fusion protein comprises a polypeptide according to 
the invention operably linked to the extracellular domain of a second protein. 

In another embodiment, the fusion protein is a GST-fusion protein in which the polypeptide 
sequences of the invention are fused to the C-terminus of the GST (i.e., glutathione 
S-transferase) sequences. 

In another embodiment, the fusion protein is an immunoglobulin fusion protein in which 
the polypeptide sequences according to the invention, comprising one or more domains, are fused 
to sequences derived from a member of the immunoglobulin protein family. The 
immunoglobulin fusion proteins of the invention can be incorporated into pharmaceutical 
compositions and administered to a subject to inhibit an interaction between a ligand and a 
protein of the invention on the surface of a cell, to thereby suppress signal transduction in vivo. 
The immunoglobulin fusion proteins can be used to affect the bioavailability of a cognate ligand. 
Inhibition of the ligand/protein interaction may be useful therapeutically both for the treatment of 
proliferative and differentiative disorders, e.g., cancer, and for modulating (e.g., promoting or 
inhibiting) cell survival. Moreover, the immunoglobulin fusion proteins of the invention can be 
used as immunogens to produce antibodies in a subject, to purify ligands, and in screening assays 
to identify molecules that inhibit the interaction of a polypeptide of the invention with a ligand. 

A chimeric or fusion protein of the invention can be produced by standard recombinant 
DNA techniques. For example, DNA fragments coding for the different polypeptide sequences 
are ligated together in-frame in accordance with conventional techniques, e.g., by employing 
blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to provide for 
appropriate termini, filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to 
avoid undesirable joining, and enzymatic ligation. In another embodiment, the fusion gene can 
be synthesized by conventional techniques including automated DNA synthesizers. 
Alternatively, PCR amplification of gene fragments can be carried out using anchor primers that 
give rise to complementary overhangs between two consecutive gene fragments that can 
subsequently be annealed and reamplified to generate a chimeric gene sequence (see, for 
example, Ausubel et al. (eds.) Current Protocols in Molecular Biology, John Wiley & 
Sons, 1992). Moreover, many expression vectors are commercially available that already encode 
a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a polypeptide of the 
invention can be cloned into such an expression vector such that the fusion moiety is linked 
in-frame to the protein of the invention. 
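
As a minimal sketch of what an in-frame fusion means at the sequence level, the following Python 
example joins two coding sequences and confirms that the second partner is read in the same 
reading frame as the first. It assumes the standard genetic code. The function names translate and 
fuse_in_frame, the example sequences, and the choice to remove the stop codon of the N-terminal 
partner before joining are hypothetical and are not taken from any particular vector or kit. 

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering ('*' marks a stop codon)
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA coding sequence codon by codon, ignoring a trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def fuse_in_frame(n_partner, c_partner):
    """Join two coding sequences so that c_partner stays in the frame of n_partner.

    Hypothetical helper: both inputs are assumed to start at the first base of a codon.
    """
    n_partner, c_partner = n_partner.upper(), c_partner.upper()
    if len(n_partner) % 3 or len(c_partner) % 3:
        raise ValueError("each coding sequence must be a whole number of codons")
    # A stop codon at the end of the N-terminal partner would terminate the fusion early
    if translate(n_partner).endswith("*"):
        n_partner = n_partner[:-3]
    fusion = n_partner + c_partner
    protein = translate(fusion)
    if "*" in protein[:-1]:
        raise ValueError("premature stop codon inside the fusion")
    return fusion, protein


if __name__ == "__main__":
    # Toy sequences: Met-Ala-stop fused to Gly-Trp-stop
    print(fuse_in_frame("ATGGCTTAA", "GGTTGGTAA"))
```

If a linker or restriction site is placed between the two partners, the same check applies so long 
as the inserted bases number a multiple of three and encode no stop codon. 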

4.8 GENE THERAPY 

Mutations in a gene comprising the polynucleotides of the invention may result in loss of normal 
function of the encoded protein. The invention thus provides gene therapy to restore normal 
activity of the polypeptides of the invention, or to treat disease states involving polypeptides of 
the invention. Delivery of a functional gene encoding polypeptides of the invention to 
appropriate cells is effected ex vivo, in situ, or in vivo by use of vectors, and more particularly 
viral vectors (e.g., adenovirus, adeno-associated virus, or a retrovirus), or ex vivo by use of 
physical DNA transfer methods (e.g., liposomes or chemical treatments). See, for example, 
Anderson, Nature, supplement to vol. 392, no. 6679, pp. 25-30 (1998). For additional reviews of 
gene therapy technology see Friedmann, Science, 244: 1275-1281 (1989); Verma, Scientific 
American: 68-84 (1990); and Miller, Nature, 357: 455-460 (1992). Introduction of any one of 
the nucleotides of the present invention or a gene encoding the polypeptides of the present 
invention can also be accomplished with extrachromosomal substrates (transient expression) or 
artificial chromosomes (stable expression). Cells may also be cultured ex vivo in the presence of 
proteins of the present invention in order to proliferate or to produce a desired effect on or 
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes. 
Alternatively, it is contemplated that in other human disease states, preventing the expression of 
or inhibiting the activity of polypeptides of the invention will be useful in treating the disease 
states. It is contemplated that antisense therapy or gene therapy could be applied to negatively 
regulate the expression of polypeptides of the invention. 

Other methods of inhibiting expression of a protein include the introduction of antisense 
molecules to the nucleic acids of the present invention, their complements, or their translated RNA 
sequences, by methods known in the art. Further, the polypeptides of the present invention can be 
inhibited by using targeted deletion methods, or the insertion of a negative regulatory element such 
as a silencer, which is tissue specific. 

The present invention still further provides cells genetically engineered in vivo to express the 
polynucleotides of the invention, wherein such polynucleotides are in operative association with a 
regulatory sequence heterologous to the host cell which drives expression of the polynucleotides in 
the cell. These methods can be used to increase or decrease the expression of the polynucleotides of 
the present invention. 

Knowledge of DNA sequences provided by the invention allows for modification of cells to 
permit, increase, or decrease, expression of endogenous polypeptide. Cells can be modified (e.g., by 
homologous recombination) to provide increased polypeptide expression by replacing, in whole or 
in part, the naturally occurring promoter with all or part of a heterologous promoter so that the cells 
express the protein at higher levels. The heterologous promoter is inserted in such a manner that it is 
operatively linked to the desired protein encoding sequences. See, for example, PCT International 
Publication No. WO 94/12650, PCT International Publication No. WO 92/20808, and PCT 
International Publication No. WO 91/09955. It is also contemplated that, in addition to heterologous 
promoter DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which 
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or 
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the desired 
protein coding sequence, amplification of the marker DNA by standard selection methods results in 
co-amplification of the desired protein coding sequences in the cells. 

In another embodiment of the present invention, cells and tissues may be engineered to 
express an endogenous gene comprising the polynucleotides of the invention under the control of 
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene may 
be replaced by homologous recombination. As described herein, gene targeting can be used to 
replace a gene's existing regulatory region with a regulatory sequence isolated from a different gene 
or a novel regulatory sequence synthesized by genetic engineering methods. Such regulatory 
sequences may be comprised of promoters, enhancers, scaffold-attachment regions, negative 
regulatory elements, transcriptional initiation sites, regulatory protein binding sites or combinations 
of said sequences. Alternatively, sequences which affect the structure or stability of the RNA or 
protein produced may be replaced, removed, added, or otherwise modified by targeting. These 
sequences include polyadenylation signals, mRNA stability elements, splice sites, leader sequences 
for enhancing or modifying transport or secretion properties of the protein, or other sequences 
which alter or improve the function or stability of protein or RNA molecules. 

The targeting event may be a simple insertion of the regulatory sequence, placing the gene 
under the control of the new regulatory sequence, e.g., inserting a new promoter or enhancer or both 
upstream of a gene. Alternatively, the targeting event may be a simple deletion of a regulatory 
element, such as the deletion of a tissue-specific negative regulatory element. Alternatively, the 
targeting event may replace an existing element; for example, a tissue-specific enhancer can be 
replaced by an enhancer that has broader or different cell-type specificity than the naturally 
occurring elements. Here, the naturally occurring sequences are deleted and new sequences are 
added. In all cases, the identification of the targeting event may be facilitated by the use of one or 
more selectable marker genes that are contiguous with the targeting DNA, allowing for the selection 
of cells in which the exogenous DNA has integrated into the cell genome. The identification of the 
targeting event may also be facilitated by the use of one or more marker genes exhibiting the 
property of negative selection, such that the negatively selectable marker is linked to the exogenous 
DNA, but configured such that the negatively selectable marker flanks the targeting sequence, and 
such that a correct homologous recombination event with sequences in the host cell genome does 
not result in the stable integration of the negatively selectable marker. Markers useful for this 
purpose include the Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial 
xanthine-guanine phosphoribosyl-transferase (gpt) gene. 

The gene targeting or gene activation techniques which can be used in accordance with this 
aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to Chappel; 
U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No. PCT/US92/09627 
(WO93/09222) by Selden et al.; and International Application No. PCT/US90/06436 
(WO91/06667) by Skoultchi et al., each of which is incorporated by reference herein in its entirety. 

4.9 TRANSGENIC ANIMALS 

In preferred methods to determine biological functions of the polypeptides of the 
invention in vivo, one or more genes provided by the invention are either overexpressed or 
inactivated in the germ line of animals using homologous recombination [Capecchi, Science 
244:1288-1292 (1989)]. Animals in which the gene is overexpressed, under the regulatory 
control of exogenous or endogenous promoter elements, are known as transgenic animals. 
Animals in which an endogenous gene has been inactivated by homologous recombination are 
referred to as "knockout" animals. Knockout animals, preferably non-human mammals, can be 
prepared as described in U.S. Patent No. 5,557,032, incorporated herein by reference. Transgenic 
animals are useful to determine the roles polypeptides of the invention play in biological 
processes, and preferably in disease states. Transgenic animals are useful as model systems to 
identify compounds that modulate lipid metabolism. Transgenic animals, preferably non-human 
mammals, are produced using methods as described in U.S. Patent No. 5,489,743 and PCT 
Publication No. WO94/28122, incorporated herein by reference. 

Transgenic animals can be prepared wherein all or part of a promoter of the 
polynucleotides of the invention is either activated or inactivated to alter the level of expression 
of the polypeptides of the invention. Inactivation can be carried out using homologous 
recombination methods described above. Activation can be achieved by supplementing or even 
replacing the homologous promoter to provide for increased protein expression. The homologous 
promoter can be supplemented by insertion of one or more heterologous enhancer elements 
known to confer promoter activation in a particular tissue. 

The polynucleotides of the present invention also make possible the development, 
through, e.g., homologous recombination or knockout strategies, of animals that fail to express 
polypeptides of the invention or that express a variant polypeptide. Such animals are useful as 
models for studying the in vivo activities of the polypeptides as well as for studying modulators 
of the polypeptides of the invention. 

4.10 USES AND BIOLOGICAL ACTIVITY 

The polynucleotides and proteins of the present invention are expected to exhibit one or 
more of the uses or biological activities (including those associated with assays cited herein) 
identified herein. Uses or activities described for proteins of the present invention may be 
provided by administration or use of such proteins or of polynucleotides encoding such proteins 
(such as, for example, in gene therapies or vectors suitable for introduction of DNA). The 
mechanism underlying the particular condition or pathology will dictate whether the 
polypeptides of the invention, the polynucleotides of the invention or modulators (activators or 
inhibitors) thereof would be beneficial to the subject in need of treatment. Thus, "therapeutic 
compositions of the invention" include compositions comprising isolated polynucleotides 
(including recombinant DNA molecules, cloned genes and degenerate variants thereof) or 
polypeptides of the invention (including full length protein, mature protein and truncations or 
domains thereof), or compounds and other substances that modulate the overall activity of the 
target gene products, either at the level of target gene/protein expression or target protein 
activity. Such modulators include polypeptides, analogs (variants), including fragments and 
fusion proteins, antibodies and other binding proteins; chemical compounds that directly or 
indirectly activate or inhibit the polypeptides of the invention (identified, e.g., via drug screening 
assays as described herein); antisense polynucleotides and polynucleotides suitable for triple 
helix formation; and in particular antibodies or other binding partners that specifically recognize 
one or more epitopes of the polypeptides of the invention. 

The polypeptides of the present invention may likewise be involved in cellular activation 
or in one of the other physiological pathways described herein. 

4.10.1 RESEARCH USES AND UTILITIES 

The polynucleotides provided by the present invention can be used by the research 
community for various purposes. The polynucleotides can be used to express recombinant 
protein for analysis, characterization or therapeutic use; as markers for tissues in which the 
corresponding protein is preferentially expressed (either constitutively or at a particular stage of 
tissue differentiation or development or in disease states); as molecular weight markers on gels; 
as chromosome markers or tags (when labeled) to identify chromosomes or to map related gene 
positions; to compare with endogenous DNA sequences in patients to identify potential genetic 
disorders; as probes to hybridize and thus discover novel, related DNA sequences; as a source of 
information to derive PCR primers for genetic fingerprinting; as a probe to "subtract-out" known 
sequences in the process of discovering other novel polynucleotides; for selecting and making 
oligomers for attachment to a "gene chip" or other support, including for examination of 
expression patterns; to raise anti-protein antibodies using DNA immunization techniques; and as 
an antigen to raise anti-DNA antibodies or elicit another immune response. Where the 
polynucleotide encodes a protein which binds or potentially binds to another protein (such as, for 
example, in a receptor-ligand interaction), the polynucleotide can also be used in interaction trap 
assays (such as, for example, that described in Gyuris et al., Cell 75:791-803 (1993)) to identify 
polynucleotides encoding the other protein with which binding occurs or to identify inhibitors of 
the binding interaction. 

The polypeptides provided by the present invention can similarly be used in assays to 
determine biological activity, including in a panel of multiple proteins for high-throughput 
screening; to raise antibodies or to elicit another immune response; as a reagent (including the 
labeled reagent) in assays designed to quantitatively determine levels of the protein (or its 
receptor) in biological fluids; as markers for tissues in which the corresponding polypeptide is 
preferentially expressed (either constitutively or at a particular stage of tissue differentiation or 
development or in a disease state); and, of course, to isolate correlative receptors or ligands. 
Proteins involved in these binding interactions can also be used to screen for peptide or small 
molecule inhibitors or agonists of the binding interaction. 

Any or all of these research utilities are capable of being developed into reagent grade or 
kit format for commercialization as research products. 

Methods for performing the uses listed above are well known to those skilled in the art. 
References disclosing such methods include without limitation "Molecular Cloning: A 
Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E. F. Fritsch 
and T. Maniatis eds., 1989, and "Methods in Enzymology: Guide to Molecular Cloning 
Techniques", Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987. 

4.10.2 NUTRITIONAL USES 

Polynucleotides and polypeptides of the present invention can also be used as nutritional 
sources or supplements. Such uses include without limitation use as a protein or amino acid 
supplement, use as a carbon source, use as a nitrogen source and use as a source of carbohydrate. In 
such cases the polypeptide or polynucleotide of the invention can be added to the feed of a 
particular organism or can be administered as a separate solid or liquid preparation, such as in the 
form of powder, pills, solutions, suspensions or capsules. In the case of microorganisms, the 
polypeptide or polynucleotide of the invention can be added to the medium in or on which the 
microorganism is cultured. 

4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION 
ACTIVITY 

A polypeptide of the present invention may exhibit activity relating to cytokine, cell 
proliferation (either inducing or inhibiting) or cell differentiation (either inducing or inhibiting) 
activity or may induce production of other cytokines in certain cell populations. A 
polynucleotide of the invention can encode a polypeptide exhibiting such attributes. Many 
protein factors discovered to date, including all known cytokines, have exhibited activity in one 
or more factor-dependent cell proliferation assays, and hence the assays serve as a convenient 
confirmation of cytokine activity. The activity of therapeutic compositions of the present 
invention is evidenced by any one of a number of routine factor dependent cell proliferation 
assays for cell lines including, without limitation, 32D, DA2, DA1G, T10, B9, B9/11, BaF3, 
MC9/G, M+(preB M+), 2E8, RB5, DA1, 123, T1165, HT2, CTLL2, TF-1, Mo7e, CMK, 
HUVEC, and Caco. Therapeutic compositions of the invention can be used in the following: 

Assays for T-cell or thymocyte proliferation include without limitation those described 
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. 
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, 
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in 
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Bertagnolli et al., J. Immunol. 
145:1706-1712, 1990; Bertagnolli et al., Cellular Immunology 133:327-341, 1991; Bertagnolli 
et al., J. Immunol. 149:3778-3783, 1992; Bowman et al., J. Immunol. 152:1756-1761, 1994. 

Assays for cytokine production and/or proliferation of spleen cells, lymph node cells or 
thymocytes include, without limitation, those described in: Polyclonal T cell stimulation, 
Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology. J. E. e.a. Coligan 
eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and Measurement of mouse 
and human interleukin-γ, Schreiber, R. D. In Current Protocols in Immunology. J. E. e.a. Coligan 
eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994. 

Assays for proliferation and differentiation of hematopoietic and lymphopoietic cells 
include, without limitation, those described in: Measurement of Human and Murine Interleukin 2 
and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current Protocols in 
Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and Sons, Toronto. 1991; 
deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al., Nature 336:690-692, 1988; 
Greenberger et al., Proc. Natl. Acad. Sci. U.S.A. 80:2931-2938, 1983; Measurement of mouse 
and human interleukin 6-Nordan, R. In Current Protocols in Immunology. J. E. Coligan eds. Vol 
1 pp. 6.6.1-6.6.5, John Wiley and Sons, Toronto. 1991; Smith et al., Proc. Natl. Acad. Sci. 
U.S.A. 83:1857-1861, 1986; Measurement of human Interleukin 11-Bennett, F., Giannotti, J., 
Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp. 
6.15.1 John Wiley and Sons, Toronto. 1991; Measurement of mouse and human Interleukin 
9-Ciarletta, A., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. 
J. E. Coligan eds. Vol 1 pp. 6.13.1, John Wiley and Sons, Toronto. 1991. 

Assays for T-cell clone responses to antigens (which will identify, among others, proteins 
that affect APC-T cell interactions as well as direct T-cell effects by measuring proliferation and 
cytokine production) include, without limitation, those described in: Current Protocols in 
Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, 
Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse 
Lymphocyte Function; Chapter 6, Cytokines and their cellular receptors; Chapter 7, 
Immunologic studies in Humans); Weinberger et al., Proc. Natl. Acad. Sci. USA 77:6091-6095, 
1980; Weinberger et al., Eur. J. Immun. 11:405-411, 1981; Takai et al., J. Immunol. 
137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988. 

4.10.4 STEM CELL GROWTH FACTOR ACTIVITY 

A polypeptide of the present invention may exhibit stem cell growth factor activity and 
be involved in the proliferation, differentiation and survival of pluripotent and totipotent stem 
cells including primordial germ cells, embryonic stem cells, hematopoietic stem cells and/or 
germ line stem cells. Administration of the polypeptide of the invention to stem cells in vivo or 
ex vivo is expected to maintain and expand cell populations in a totipotential or pluripotential 
state which would be useful for re-engineering damaged or diseased tissues, transplantation, 
manufacture of bio-pharmaceuticals and the development of bio-sensors. The ability to produce 
large quantities of human cells has important working applications for the production of human 
proteins which currently must be obtained from non-human sources or donors; implantation of 
cells to treat diseases such as Parkinson's, Alzheimer's and other neurodegenerative diseases; 
tissues for grafting such as bone marrow, skin, cartilage, tendons, bone, muscle (including 
cardiac muscle), blood vessels, cornea, neural cells, gastrointestinal cells and others; and organs 
for transplantation such as kidney, liver, pancreas (including islet cells), heart and lung. 

It is contemplated that multiple different exogenous growth factors and/or cytokines may 
be administered in combination with the polypeptide of the invention to achieve the desired 
effect, including any of the growth factors listed herein, other stem cell maintenance factors, and 
specifically including stem cell factor (SCF), leukemia inhibitory factor (LIF), Flt-3 ligand 
(Flt-3L), any of the interleukins, recombinant soluble IL-6 receptor fused to IL-6, macrophage 
inflammatory protein 1-alpha (MIP-1-alpha), G-CSF, GM-CSF, thrombopoietin (TPO), platelet 
factor 4 (PF-4), platelet-derived growth factor (PDGF), neural growth factors and basic fibroblast 
growth factor (bFGF). 

Since totipotent stem cells can give rise to virtually any mature cell type, expansion of 
these cells in culture will facilitate the production of large quantities of mature cells. Techniques 
for culturing stem cells are known in the art and administration of polypeptides of the invention, 
optionally with other growth factors and/or cytokines, is expected to enhance the survival and 
proliferation of the stem cell populations. This can be accomplished by direct administration of 
the polypeptide of the invention to the culture medium. Alternatively, stromal cells transfected 
with a polynucleotide that encodes the polypeptide of the invention can be used as a feeder 
layer for the stem cell populations in culture or in vivo. Stromal support cells for feeder layers 
may include embryonic bone marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or 
cultured embryonic fibroblasts (see U.S. Patent No. 5,690,926). 

Stem cells themselves can be transfected with a polynucleotide of the invention to induce 
autocrine expression of the polypeptide of the invention. This will allow for generation of 
undifferentiated totipotential/pluripotential stem cell lines that are useful as is or that can then be 
differentiated into the desired mature cell types. These stable cell lines can also serve as a source 
of undifferentiated totipotential/pluripotential mRNA to create cDNA libraries and templates for 
polymerase chain reaction experiments. These studies would allow for the isolation and 
identification of differentially expressed genes in stem cell populations that regulate stem cell 
proliferation and/or maintenance. 

Expansion and maintenance of totipotent stem cell populations will be useful in the 
treatment of many pathological conditions. For example, polypeptides of the present invention 
may be used to manipulate stem cells in culture to give rise to neuroepithelial cells that can be 
used to augment or replace cells damaged by illness, autoimmune disease, accidental damage or 
genetic disorders. The polypeptide of the invention may be useful for inducing the proliferation 
of neural cells and for the regeneration of nerve and brain tissue, i.e., for the treatment of central 
and peripheral nervous system diseases and neuropathies, as well as mechanical and traumatic 
disorders which involve degeneration, death or trauma to neural cells or nerve tissue. In addition, 
the expanded stem cell populations can also be genetically altered for gene therapy purposes and 
to decrease host rejection of replacement tissues after grafting or implantation. 

Expression of the polypeptide of the invention and its effect on stem cells can also be
manipulated to achieve controlled differentiation of the stem cells into more differentiated cell
types. A broadly applicable method of obtaining pure populations of a specific differentiated
cell type from undifferentiated stem cell populations involves the use of a cell-type specific
promoter driving a selectable marker. The selectable marker allows only cells of the desired type
to survive. For example, stem cells can be induced to differentiate into cardiomyocytes (Wobus
et al., Differentiation, 48: 173-182, (1991); Klug et al., J. Clin. Invest., 98(1): 216-224, (1998))
or skeletal muscle cells (Browder, L. W. In: Principles of Tissue Engineering eds. Lanza et al.,
Academic Press (1997)). Alternatively, directed differentiation of stem cells can be
accomplished by culturing the stem cells in the presence of a differentiation factor such as
retinoic acid and an antagonist of the polypeptide of the invention which would inhibit the
effects of endogenous stem cell factor activity and allow differentiation to proceed.

In vitro cultures of stem cells can be used to determine if the polypeptide of the invention
exhibits stem cell growth factor activity. Stem cells are isolated from any one of various cell
sources (including hematopoietic stem cells and embryonic stem cells) and cultured on a feeder
layer, as described by Thompson et al., Proc. Natl. Acad. Sci. U.S.A., 92: 7844-7848 (1995), in
the presence of the polypeptide of the invention alone or in combination with other growth
factors or cytokines. The ability of the polypeptide of the invention to induce stem cell
proliferation is determined by colony formation on semi-solid support, e.g., as described by
Bernstein et al., Blood, 77: 2316-2321 (1991).
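
As a rough illustration of how the read-out of such a colony formation assay might be summarized, the following sketch compares mean colony counts from plates cultured with and without the polypeptide. It is a minimal example under stated assumptions, not a method taken from the cited references: the function name, the use of a simple ratio of means, and the plate counts are all hypothetical.

```python
from statistics import mean


def colony_fold_change(treated_counts, control_counts):
    """Ratio of the mean colony count with the polypeptide to the mean count without it.

    Both arguments are lists of colony counts, one entry per replicate plate.
    The ratio-of-means summary is an assumption made for illustration.
    """
    if not treated_counts or not control_counts:
        raise ValueError("each condition needs at least one replicate plate")
    control_mean = mean(control_counts)
    if control_mean == 0:
        raise ValueError("control mean is zero; fold change is undefined")
    return mean(treated_counts) / control_mean


# Hypothetical colony counts per plate (illustrative numbers, not data).
treated = [48, 52, 50]
control = [24, 26, 25]
print(colony_fold_change(treated, control))
```

A fold change well above 1 across replicates would be consistent with proliferative activity in this kind of assay; a real analysis would also need a statistical comparison of the replicate plates.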



4.10.5 HEMATOPOIESIS REGULATING ACTIVITY 

A polypeptide of the present invention may be involved in regulation of hematopoiesis
and, consequently, in the treatment of myeloid or lymphoid cell disorders. Even marginal
biological activity in support of colony forming cells or of factor-dependent cell lines indicates
involvement in regulating hematopoiesis, e.g., in supporting the growth and proliferation of
erythroid progenitor cells alone or in combination with other cytokines, thereby indicating utility,
for example, in treating various anemias or for use in conjunction with irradiation/chemotherapy
to stimulate the production of erythroid precursors and/or erythroid cells; in supporting the
growth and proliferation of myeloid cells such as granulocytes and monocytes/macrophages (i.e.,
traditional CSF activity) useful, for example, in conjunction with chemotherapy to prevent or
treat consequent myelo-suppression; in supporting the growth and proliferation of
megakaryocytes and consequently of platelets thereby allowing prevention or treatment of
various platelet disorders such as thrombocytopenia, and generally for use in place of or
complementary to platelet transfusions; and/or in supporting the growth and proliferation of
hematopoietic stem cells which are capable of maturing to any and all of the above-mentioned
hematopoietic cells and therefore find therapeutic utility in various stem cell disorders (such as
those usually treated with transplantation, including, without limitation, aplastic anemia and
paroxysmal nocturnal hemoglobinuria), as well as in repopulating the stem cell compartment
post irradiation/chemotherapy, either in vivo or ex vivo (i.e., in conjunction with bone marrow
transplantation or with peripheral progenitor cell transplantation (homologous or heterologous))
as normal cells or genetically manipulated for gene therapy.

Therapeutic compositions of the invention can be used in the following: 

Suitable assays for proliferation and differentiation of various hematopoietic lines are
cited above.

Assays for embryonic stem cell differentiation (which will identify, among others,
proteins that influence embryonic differentiation hematopoiesis) include, without limitation,
those described in: Johansson et al., Cellular Biology 15:141-151, 1995; Keller et al., Molecular
and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood 81:2903-2915, 1993.

Assays for stem cell survival and differentiation (which will identify, among others,
proteins that regulate lympho-hematopoiesis) include, without limitation, those described in:
Methylcellulose colony forming assays, Freshney, M. G. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New York, N.Y. 1994; Hirayama et al.,
Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992; Primitive hematopoietic colony forming cells
with high proliferative potential, McNiece, I. K. and Briddell, R. A. In Culture of Hematopoietic
Cells. R. I. Freshney, et al. eds. Vol pp. 23-39, Wiley-Liss, Inc., New York, N.Y. 1994; Neben et
al., Experimental Hematology 22:353-359, 1994; Cobblestone area forming cell assay,
Ploemacher, R. E. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 1-21,
Wiley-Liss, Inc., New York, N.Y. 1994; Long term bone marrow cultures in the presence of
stromal cells, Spooncer, E., Dexter, M. and Allen, T. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 163-179, Wiley-Liss, Inc., New York, N.Y. 1994; Long term culture
initiating cell assay, Sutherland, H. J. In Culture of Hematopoietic Cells. R. I. Freshney, et al.
eds. Vol pp. 139-162, Wiley-Liss, Inc., New York, N.Y. 1994.

4.10.6 TISSUE GROWTH ACTIVITY 

A polypeptide of the present invention also may be involved in bone, cartilage, tendon, 
ligament and/or nerve tissue growth or regeneration, as well as in wound healing and tissue 
repair and replacement, and in healing of burns, incisions and ulcers.

A polypeptide of the present invention which induces cartilage and/or bone growth in 
circumstances where bone is not normally formed, has application in the healing of bone 
fractures and cartilage damage or defects in humans and other animals. Compositions of a 
polypeptide, antibody, binding partner, or other modulator of the invention may have 
prophylactic use in closed as well as open fracture reduction and also in the improved fixation of 
artificial joints. De novo bone formation induced by an osteogenic agent contributes to the repair 
of congenital, trauma induced, or oncologic resection induced craniofacial defects, and also is 
useful in cosmetic plastic surgery. 

A polypeptide of this invention may also be involved in attracting bone-forming cells, 
stimulating growth of bone-forming cells, or inducing differentiation of progenitors of 
bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone degenerative disorders, or 
periodontal disease, such as through stimulation of bone and/or cartilage repair or by blocking 
inflammation or processes of tissue destruction (collagenase activity, osteoclast activity, etc.) 
mediated by inflammatory processes may also be possible using the composition of the 
invention.

Another category of tissue regeneration activity that may involve the polypeptide of the
present invention is tendon/ligament formation. Induction of tendon/ligament-like tissue or
other tissue formation in circumstances where such tissue is not normally formed has application
in the healing of tendon or ligament tears, deformities and other tendon or ligament defects in
humans and other animals. Such a preparation employing a tendon/ligament-like tissue inducing
protein may have prophylactic use in preventing damage to tendon or ligament tissue, as well as
use in the improved fixation of tendon or ligament to bone or other tissues, and in repairing
defects to tendon or ligament tissue. De novo tendon/ligament-like tissue formation induced by
a composition of the present invention contributes to the repair of congenital, trauma induced, or
other tendon or ligament defects of other origin, and is also useful in cosmetic plastic surgery for
attachment or repair of tendons or ligaments. The compositions of the present invention may
provide an environment to attract tendon- or ligament-forming cells, stimulate growth of tendon- or
ligament-forming cells, induce differentiation of progenitors of tendon- or ligament-forming
cells, or induce growth of tendon/ligament cells or progenitors ex vivo for return in vivo to effect
tissue repair. The compositions of the invention may also be useful in the treatment of tendinitis,
carpal tunnel syndrome and other tendon or ligament defects. The compositions may also include
an appropriate matrix and/or sequestering agent as a carrier as is well known in the art.

The compositions of the present invention may also be useful for proliferation of neural
cells and for regeneration of nerve and brain tissue, i.e., for the treatment of central and peripheral
nervous system diseases and neuropathies, as well as mechanical and traumatic disorders, which
involve degeneration, death or trauma to neural cells or nerve tissue. More specifically, a
composition may be used in the treatment of diseases of the peripheral nervous system, such as
peripheral nerve injuries, peripheral neuropathy and localized neuropathies, and central nervous
system diseases, such as Alzheimer's, Parkinson's disease, Huntington's disease, amyotrophic
lateral sclerosis, and Shy-Drager syndrome. Further conditions which may be treated in
accordance with the present invention include mechanical and traumatic disorders, such as spinal
cord disorders, head trauma and cerebrovascular diseases such as stroke. Peripheral neuropathies
resulting from chemotherapy or other medical therapies may also be treatable using a
composition of the invention.

Compositions of the invention may also be useful to promote better or faster closure of
non-healing wounds, including without limitation pressure ulcers, ulcers associated with vascular
insufficiency, surgical and traumatic wounds, and the like.

Compositions of the present invention may also be involved in the generation or
regeneration of other tissues, such as organs (including, for example, pancreas, liver, intestine,
kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular (including vascular
endothelium) tissue, or for promoting the growth of cells comprising such tissues. Part of the
desired effect may be achieved by inhibition or modulation of fibrotic scarring, which may allow
normal tissue to regenerate. A polypeptide of the present invention may also exhibit angiogenic
activity.

A composition of the present invention may also be useful for gut protection or
regeneration and treatment of lung or liver fibrosis, reperfusion injury in various tissues, and
conditions resulting from systemic cytokine damage.

A composition of the present invention may also be useful for promoting or inhibiting
differentiation of tissues described above from precursor tissues or cells; or for inhibiting the
growth of tissues described above.

Therapeutic compositions of the invention can be used in the following:

Assays for tissue generation activity include, without limitation, those described in:
International Patent Publication No. WO95/16035 (bone, cartilage, tendon); International Patent
Publication No. WO95/05846 (nerve, neuronal); International Patent Publication No.
WO91/07491 (skin, endothelium).

Assays for wound healing activity include, without limitation, those described in: Winter,
Epidermal Wound Healing, pp. 71-112 (Maibach, H. I. and Rovee, D. T., eds.), Year Book
Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J. Invest. Dermatol.
71:382-84 (1978).

4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY

A polypeptide of the present invention may also exhibit immune stimulating or immune
suppressing activity, including without limitation the activities for which assays are described
herein. A polynucleotide of the invention can encode a polypeptide exhibiting such activities. A
protein may be useful in the treatment of various immune deficiencies and disorders (including
severe combined immunodeficiency (SCID)), e.g., in regulating (up or down) growth and
proliferation of T and/or B lymphocytes, as well as affecting the cytolytic activity of NK cells
and other cell populations. These immune deficiencies may be genetic or be caused by viral (e.g.,
HIV) as well as bacterial or fungal infections, or may result from autoimmune disorders. More
specifically, infectious diseases caused by viral, bacterial, fungal or other infection may be
treatable using a protein of the present invention, including infections by HIV, hepatitis viruses,
herpes viruses, mycobacteria, Leishmania spp., malaria spp. and various fungal infections such
as candidiasis. Of course, in this regard, proteins of the present invention may also be useful
where a boost to the immune system generally may be desirable, i.e., in the treatment of cancer.

Autoimmune disorders which may be treated using a protein of the present invention
include, for example, connective tissue disease, multiple sclerosis, systemic lupus erythematosus,
rheumatoid arthritis, autoimmune pulmonary inflammation, Guillain-Barre syndrome,
autoimmune thyroiditis, insulin dependent diabetes mellitus, myasthenia gravis, graft-versus-host
disease and autoimmune inflammatory eye disease. Such a protein (or antagonists thereof,
including antibodies) of the present invention may also be useful in the treatment of allergic
reactions and conditions (e.g., anaphylaxis, serum sickness, drug reactions, food allergies, insect
venom allergies, mastocytosis, allergic rhinitis, hypersensitivity pneumonitis, urticaria,
angioedema, eczema, atopic dermatitis, allergic contact dermatitis, erythema multiforme,
Stevens-Johnson syndrome, allergic conjunctivitis, atopic keratoconjunctivitis, vernal
keratoconjunctivitis, giant papillary conjunctivitis and contact allergies), such as asthma
(particularly allergic asthma) or other respiratory problems. Other conditions, in which immune
suppression is desired (including, for example, organ transplantation), may also be treatable
using a protein (or antagonists thereof) of the present invention. The therapeutic effects of the
polypeptides or antagonists thereof on allergic reactions can be evaluated by in vivo animal
models such as the cumulative contact enhancement test (Lastbom et al., Toxicology 125: 59-66,
1998), skin prick test (Hoffmann et al., Allergy 54: 446-54, 1999), guinea pig skin sensitization
test (Vohr et al., Arch. Toxicol. 73: 501-9), and murine local lymph node assay (Kimber et al.,
J. Toxicol. Environ. Health 53: 563-79).

Using the proteins of the invention, it may also be possible to modulate immune
responses in a number of ways. Down regulation may be in the form of inhibiting or blocking an
immune response already in progress or may involve preventing the induction of an immune 
response. The functions of activated T cells may be inhibited by suppressing T cell responses or 
by inducing specific tolerance in T cells, or both. Immunosuppression of T cell responses is 
generally an active, non-antigen-specific, process which requires continuous exposure of the T 
cells to the suppressive agent. Tolerance, which involves inducing non-responsiveness or anergy 
in T cells, is distinguishable from immunosuppression in that it is generally antigen-specific and 
persists after exposure to the tolerizing agent has ceased. Operationally, tolerance can be 
demonstrated by the lack of a T cell response upon reexposure to specific antigen in the absence 
of the tolerizing agent. 

Down regulating or preventing one or more antigen functions (including without 
limitation B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing high 
level lymphokine synthesis by activated T cells, will be useful in situations of tissue, skin and 
organ transplantation and in graft-versus-host disease (GVHD). For example, blockage of T cell 
function should result in reduced tissue destruction in tissue transplantation. Typically, in tissue 
transplants, rejection of the transplant is initiated through its recognition as foreign by T cells, 
followed by an immune reaction that destroys the transplant. The administration of a therapeutic
composition of the invention may prevent cytokine synthesis by immune cells, such as T cells,
and thus act as an immunosuppressant. Moreover, a lack of costimulation may also be sufficient
to anergize the T cells, thereby inducing tolerance in a subject. Induction of long-term tolerance 
by B lymphocyte antigen-blocking reagents may avoid the necessity of repeated administration 
of these blocking reagents. To achieve sufficient immunosuppression or tolerance in a subject, it 
may also be necessary to block the function of a combination of B lymphocyte antigens. 

The efficacy of particular therapeutic compositions in preventing organ transplant 
rejection or GVHD can be assessed using animal models that are predictive of efficacy in 
humans. Examples of appropriate systems which can be used include allogeneic cardiac grafts in 
rats and xenogeneic pancreatic islet cell grafts in mice, both of which have been used to examine 
the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as described in Lenschow et
al., Science 257:789-792 (1992) and Turka et al., Proc. Natl. Acad. Sci. USA, 89:11102-11105
(1992). In addition, murine models of GVHD (see Paul ed., Fundamental Immunology, Raven
Press, New York, 1989, pp. 846-847) can be used to determine the effect of therapeutic 
compositions of the invention on the development of that disease. 

Blocking antigen function may also be therapeutically useful for treating autoimmune 
diseases. Many autoimmune disorders are the result of inappropriate activation of T cells that are 
reactive against self tissue and which promote the production of cytokines and autoantibodies 
involved in the pathology of the diseases. Preventing the activation of autoreactive T cells may 
reduce or eliminate disease symptoms. Administration of reagents which block stimulation of T 
cells can be used to inhibit T cell activation and prevent production of autoantibodies or T 
cell-derived cytokines which may be involved in the disease process. Additionally, blocking 
reagents may induce antigen-specific tolerance of autoreactive T cells which could lead to 
long-term relief from the disease. The efficacy of blocking reagents in preventing or alleviating 
autoimmune disorders can be determined using a number of well-characterized animal models of 
human autoimmune diseases. Examples include murine experimental autoimmune encephalitis, 
systemic lupus erythematosus in MRL/lpr/lpr mice or NZB hybrid mice, murine autoimmune
collagen arthritis, diabetes mellitus in NOD mice and BB rats, and murine experimental 
myasthenia gravis (see Paul ed., Fundamental Immunology, Raven Press, New York, 1989, pp. 
840-856). 

Upregulation of an antigen function (e.g., a B lymphocyte antigen function), as a means 
of up regulating immune responses, may also be useful in therapy. Upregulation of immune 
responses may be in the form of enhancing an existing immune response or eliciting an initial 
immune response. For example, enhancing an immune response may be useful in cases of viral 
infection, including systemic viral diseases such as influenza, the common cold, and encephalitis.

Alternatively, anti-viral immune responses may be enhanced in an infected patient by
removing T cells from the patient, costimulating the T cells in vitro with viral antigen-pulsed
APCs either expressing a peptide of the present invention or together with a stimulatory form of
a soluble peptide of the present invention and reintroducing the in vitro activated T cells into the
patient. Another method of enhancing anti-viral immune responses would be to isolate infected
cells from a patient, transfect them with a nucleic acid encoding a protein of the present
invention as described herein such that the cells express all or a portion of the protein on their
surface, and reintroduce the transfected cells into the patient. The infected cells would now be
capable of delivering a costimulatory signal to, and thereby activate, T cells in vivo.

A polypeptide of the present invention may provide the necessary stimulation signal to T
cells to induce a T cell mediated immune response against the transfected tumor cells. In
addition, tumor cells which lack MHC class I or MHC class II molecules, or which fail to
reexpress sufficient amounts of MHC class I or MHC class II molecules, can be transfected with
nucleic acid encoding all or a portion (e.g., a cytoplasmic-domain truncated portion) of an
MHC class I alpha chain protein and β2 microglobulin protein or an MHC class II alpha chain
protein and an MHC class II beta chain protein to thereby express MHC class I or MHC class II
proteins on the cell surface. Expression of the appropriate class I or class II MHC in conjunction
with a peptide having the activity of a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a T
cell mediated immune response against the transfected tumor cell. Optionally, a gene encoding
an antisense construct which blocks expression of an MHC class II associated protein, such as
the invariant chain, can also be cotransfected with a DNA encoding a peptide having the activity
of a B lymphocyte antigen to promote presentation of tumor associated antigens and induce
tumor specific immunity. Thus, the induction of a T cell mediated immune response in a human
subject may be sufficient to overcome tumor-specific tolerance in the subject.

The activity of a protein of the invention may, among other means, be measured by the
following methods:

Suitable assays for thymocyte or splenocyte cytotoxicity include, without limitation,
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D.
H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19;
Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc. Natl. Acad. Sci. USA
78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et al., J.
Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J.
Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998; Bertagnolli et al.,
Cellular Immunology 133:327-341, 1991; Brown et al., J. Immunol. 153:3079-3092, 1994.

Assays for T-cell-dependent immunoglobulin responses and isotype switching (which
will identify, among others, proteins that modulate T-cell dependent antibody responses and that
affect Th1/Th2 profiles) include, without limitation, those described in: Maliszewski, J.
Immunol. 144:3028-3033, 1990; and Assays for B cell function: In vitro antibody production,
Mond, J. J. and Brunswick, M. In Current Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1
pp. 3.8.1-3.8.16, John Wiley and Sons, Toronto. 1994.

Mixed lymphocyte reaction (MLR) assays (which will identify, among others, proteins
that generate predominantly Th1 and CTL responses) include, without limitation, those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3,
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512,
1988; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992.

Dendritic cell-dependent assays (which will identify, among others, proteins expressed by
dendritic cells that activate naive T-cells) include, without limitation, those described in: Guery
et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of Experimental Medicine
173:549-559, 1991; Macatonia et al., Journal of Immunology 154:5071-5079, 1995; Porgador et
al., Journal of Experimental Medicine 182:255-260, 1995; Nair et al., Journal of Virology
67:4062-4069, 1993; Huang et al., Science 264:961-965, 1994; Macatonia et al., Journal of
Experimental Medicine 169:1255-1264, 1989; Bhardwaj et al., Journal of Clinical Investigation
94:797-807, 1994; and Inaba et al., Journal of Experimental Medicine 172:631-640, 1990.

Assays for lymphocyte survival/apoptosis (which will identify, among others, proteins
that prevent apoptosis after superantigen induction and proteins that regulate lymphocyte
homeostasis) include, without limitation, those described in: Darzynkiewicz et al., Cytometry
13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993; Gorczyca et al., Cancer Research
53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991; Zacharchuk, Journal of Immunology
145:4037-4045, 1990; Zamai et al., Cytometry 14:891-897, 1993; Gorczyca et al., International
Journal of Oncology 1:639-648, 1992.

Assays for proteins that influence early steps of T-cell commitment and development
include, without limitation, those described in: Antica et al., Blood 84:111-117, 1994; Fine et al.,
Cellular Immunology 155:111-122, 1994; Galy et al., Blood 85:2770-2778, 1995; Toki et al.,
Proc. Nat. Acad. Sci. USA 88:7548-7551, 1991.

4.10.8 ACTIVIN/INHIBIN ACTIVITY

A polypeptide of the present invention may also exhibit activin- or inhibin-related
activities. A polynucleotide of the invention may encode a polypeptide exhibiting such
characteristics. Inhibins are characterized by their ability to inhibit the release of follicle
stimulating hormone (FSH), while activins are characterized by their ability to stimulate the
release of follicle stimulating hormone (FSH). Thus, a polypeptide of the present invention,
alone or in heterodimers with a member of the inhibin family, may be useful as a contraceptive
based on the ability of inhibins to decrease fertility in female mammals and decrease
spermatogenesis in male mammals. Administration of sufficient amounts of other inhibins can
induce infertility in these mammals. Alternatively, the polypeptide of the invention, as a
homodimer or as a heterodimer with other protein subunits of the inhibin group, may be useful as
a fertility inducing therapeutic, based upon the ability of activin molecules to stimulate FSH
release from cells of the anterior pituitary. See, for example, U.S. Pat. No. 4,798,885. A
polypeptide of the invention may also be useful for advancement of the onset of fertility in
sexually immature mammals, so as to increase the lifetime reproductive performance of domestic
animals such as, but not limited to, cows, sheep and pigs.

The activity of a polypeptide of the invention may, among other means, be measured by 
the following methods. 

Assays for activin/inhibin activity include, without limitation, those described in: Vale et
al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale et al., Nature
321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al., Proc. Natl. Acad. Sci.
USA 83:3091-3095, 1986.

4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY 

A polypeptide of the present invention may be involved in chemotactic or chemokinetic
activity for mammalian cells, including, for example, monocytes, fibroblasts, neutrophils,
T-cells, mast cells, eosinophils, epithelial and/or endothelial cells. A polynucleotide of the
invention can encode a polypeptide exhibiting such attributes. Chemotactic and chemokinetic
receptor activation can be used to mobilize or attract a desired cell population to a desired site of
action. Chemotactic or chemokinetic compositions (e.g., proteins, antibodies, binding partners, or
modulators of the invention) provide particular advantages in treatment of wounds and other
trauma to tissues, as well as in treatment of localized infections. For example, attraction of
lymphocytes, monocytes or neutrophils to tumors or sites of infection may result in improved
immune responses against the tumor or infecting agent.

A protein or peptide has chemotactic activity for a particular cell population if it can
stimulate, directly or indirectly, the directed orientation or movement of such cell population.
Preferably, the protein or peptide has the ability to directly stimulate directed movement of cells.
Whether a particular protein has chemotactic activity for a population of cells can be readily
determined by employing such protein or peptide in any known assay for cell chemotaxis.

Therapeutic compositions of the invention can be used in the following:

Assays for chemotactic activity (which will identify proteins that induce or prevent
chemotaxis) consist of assays that measure the ability of a protein to induce the migration of cells
across a membrane as well as the ability of a protein to induce the adhesion of one cell
population to another cell population. Suitable assays for movement and adhesion include,
without limitation, those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A.
M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates
and Wiley-Interscience (Chapter 6.12, Measurement of alpha and beta Chemokines
6.12.1-6.12.28); Taub et al., J. Clin. Invest. 95:1370-1376, 1995; Lind et al., APMIS 103:140-146,
1995; Muller et al., Eur. J. Immunol. 25:1744-1748; Gruber et al., J. of Immunol. 152:5860-5867,
1994; Johnston et al., J. of Immunol. 153:1762-1768, 1994.
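
The membrane-migration read-out mentioned above is commonly summarized as a chemotactic index, i.e., the number of cells that migrate toward the test protein divided by the number that migrate toward medium alone. The following sketch illustrates that calculation only; the function name and the cell counts are hypothetical, and the cited protocols may define their read-outs differently.

```python
def chemotactic_index(migrated_to_protein, migrated_to_medium):
    """Cells migrating toward the test protein divided by cells migrating toward medium alone.

    Both arguments are cell counts from the lower chamber of a membrane
    migration assay. The simple ratio is an assumption made for illustration.
    """
    if migrated_to_medium <= 0:
        raise ValueError("medium-only control must have a positive cell count")
    if migrated_to_protein < 0:
        raise ValueError("cell counts cannot be negative")
    return migrated_to_protein / migrated_to_medium


# Hypothetical counts (illustrative numbers, not data).
print(chemotactic_index(300, 100))
```

An index near 1 suggests no effect on migration, while values above or below 1 suggest induction or inhibition of migration, respectively. Distinguishing directed movement (chemotaxis) from increased random movement (chemokinesis) requires additional controls that this ratio alone does not capture.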


4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY 

A polypeptide of the invention may also be involved in hemostasis or thrombolysis or
thrombosis. A polynucleotide of the invention can encode a polypeptide exhibiting such
attributes. Compositions may be useful in treatment of various coagulation disorders (including
hereditary disorders, such as hemophilias) or to enhance coagulation and other hemostatic events
in treating wounds resulting from trauma, surgery or other causes. A composition of the
invention may also be useful for dissolving or inhibiting formation of thromboses and for
treatment and prevention of conditions resulting therefrom (such as, for example, infarction of
cardiac and central nervous system vessels (e.g., stroke)).

Therapeutic compositions of the invention can be used in the following:

Assays for hemostatic and thrombolytic activity include, without limitation, those
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al., Thrombosis Res. 
45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub, Prostaglandins 
35:467-474, 1988. 


4.10.11 CANCER DIAGNOSIS AND THERAPY 

Polypeptides of the invention may be involved in cancer cell generation, proliferation or 
metastasis. Detection of the presence or amount of polynucleotides or polypeptides of the 
invention may be useful for the diagnosis and/or prognosis of one or more types of cancer. For 
example, the presence or increased expression of a polynucleotide/polypeptide of the invention
may indicate a hereditary risk of cancer, a precancerous condition, or an ongoing malignancy.
Conversely, a defect in the gene or absence of the polypeptide may be associated with a cancer 
condition. Identification of single nucleotide polymorphisms associated with cancer or a 
predisposition to cancer may also be useful for diagnosis or prognosis. 

Cancer treatments promote tumor regression by inhibiting tumor cell proliferation, 
inhibiting angiogenesis (growth of new blood vessels that is necessary to support tumor growth) 
and/or prohibiting metastasis by reducing tumor cell motility or invasiveness. Therapeutic 
compositions of the invention may be effective in adult and pediatric oncology including in solid 
phase tumors/malignancies, locally advanced tumors, human soft tissue sarcomas, metastatic 
cancer, including lymphatic metastases, blood cell malignancies including multiple myeloma, 
acute and chronic leukemias, and lymphomas, head and neck cancers including mouth cancer, 
larynx cancer and thyroid cancer, lung cancers including small cell carcinoma and non-small cell 
cancers, breast cancers including small cell carcinoma and ductal carcinoma, gastrointestinal 
cancers including esophageal cancer, stomach cancer, colon cancer, colorectal cancer and polyps 
associated with colorectal neoplasia, pancreatic cancers, liver cancer, urologic cancers including 
bladder cancer and prostate cancer, malignancies of the female genital tract including ovarian 
carcinoma, uterine (including endometrial) cancers, and solid tumor in the ovarian follicle, 
kidney cancers including renal cell carcinoma, brain cancers including intrinsic brain tumors, 
neuroblastoma, astrocytic brain tumors, gliomas, metastatic tumor cell invasion in the central 
nervous system, bone cancers including osteomas, skin cancers including malignant melanoma, 
tumor progression of human skin keratinocytes, squamous cell carcinoma, basal cell carcinoma, 
hemangiopericytoma and Kaposi's sarcoma.

Polypeptides, polynucleotides, or modulators of polypeptides of the invention (including 
inhibitors and stimulators of the biological activity of the polypeptide of the invention) may be 
administered to treat cancer. Therapeutic compositions can be administered in therapeutically 
effective dosages alone or in combination with adjuvant cancer therapy such as surgery, 
chemotherapy, radiotherapy, thermotherapy, and laser therapy, and may provide a beneficial 
effect, e.g. reducing tumor size, slowing rate of tumor growth, inhibiting metastasis, or otherwise 
improving overall clinical condition, without necessarily eradicating the cancer. 

The composition can also be administered in therapeutically effective amounts as a 
portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of the polypeptide or 
modulator of the invention with one or more anti-cancer drugs in addition to a pharmaceutically 
acceptable carrier for delivery. The use of anti-cancer cocktails as a cancer treatment is routine. 
Anti-cancer drugs that are well known in the art and can be used as a treatment in combination 
with the polypeptide or modulator of the invention include: Actinomycin D, Aminoglutethimide,
Asparaginase, Bleomycin, Busulfan, Carboplatin, Carmustine, Chlorambucil, Cisplatin
(cis-DDP), Cyclophosphamide, Cytarabine HCl (Cytosine arabinoside), Dacarbazine, Dactinomycin,
Daunorubicin HCl, Doxorubicin HCl, Estramustine phosphate sodium, Etoposide (VP16-213),
Floxuridine, 5-Fluorouracil (5-FU), Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide,
Interferon Alpha-2a, Interferon Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog),
Lomustine, Mechlorethamine HCl (nitrogen mustard), Melphalan, Mercaptopurine, Mesna,
Methotrexate (MTX), Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin, Procarbazine HCl,
Streptozocin, Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine sulfate, Vincristine sulfate,
Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2, Mitoguazone, Pentostatin,
Semustine, Teniposide, and Vindesine sulfate.

In addition, therapeutic compositions of the invention may be used for prophylactic 
treatment of cancer. There are hereditary conditions and/or environmental situations (e.g. 
exposure to carcinogens) known in the art that predispose an individual to developing cancers. 
Under these circumstances, it may be beneficial to treat these individuals with therapeutically
effective doses of the polypeptide of the invention to reduce the risk of developing cancers.

In vitro models can be used to determine the effective doses of the polypeptide of the
invention as a potential cancer treatment. These in vitro models include proliferation assays of
cultured tumor cells, growth of cultured tumor cells in soft agar (see Freshney, (1987) Culture of
Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York, NY Ch 18 and Ch 21),
tumor systems in nude mice as described in Giovanella et al., J. Natl. Can. Inst., 52: 921-30
(1974), mobility and invasive potential of tumor cells in Boyden Chamber assays as described in
Pilkington et al., Anticancer Res., 17: 4107-9 (1997), and angiogenesis assays such as induction
of vascularization of the chick chorioallantoic membrane or induction of vascular endothelial
cell migration as described in Ribatta et al., Intl. J. Dev. Biol., 40: 1189-97 (1999) and Li et al.,
Clin. Exp. Metastasis, 17:423-9 (1999), respectively. Suitable tumor cell lines are available,
e.g. from American Type Tissue Culture Collection catalogs.



4.10.12 RECEPTOR/LIGAND ACTIVITY 

A polypeptide of the present invention may also demonstrate activity as a receptor,
receptor ligand or inhibitor or agonist of receptor/ligand interactions. A polynucleotide of the
invention can encode a polypeptide exhibiting such characteristics. Examples of such receptors
and ligands include, without limitation, cytokine receptors and their ligands, receptor kinases and
their ligands, receptor phosphatases and their ligands, receptors involved in cell-cell interactions
and their ligands (including without limitation, cellular adhesion molecules (such as selectins,
integrins and their ligands) and receptor/ligand pairs involved in antigen presentation, antigen
recognition and development of cellular and humoral immune responses). Receptors and ligands
are also useful for screening of potential peptide or small molecule inhibitors of the relevant
receptor/ligand interaction. A protein of the present invention (including, without limitation,
fragments of receptors and ligands) may itself be useful as an inhibitor of receptor/ligand
interactions.

The activity of a polypeptide of the invention may, among other means, be measured by
the following methods:

Suitable assays for receptor-ligand activity include without limitation those described in:
Current Protocols in Immunology, Ed. by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M.
Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 7.28,
Measurement of Cellular Adhesion under static conditions 7.28.1-7.28.22), Takai et al., Proc.
Natl. Acad. Sci. USA 84:6864-6868, 1987; Bierer et al., J. Exp. Med. 168:1145-1156, 1988;
Rosenstein et al., J. Exp. Med. 169:149-160 1989; Stoltenborg et al., J. Immunol. Methods
175:59-68, 1994; Stitt et al., Cell 80:661-670, 1995.

By way of example, the polypeptides of the invention may be used as a receptor for a
ligand(s) thereby transmitting the biological activity of that ligand(s). Ligands may be identified
through binding assays, affinity chromatography, dihybrid screening assays, BIAcore assays, gel
overlay assays, or other methods known in the art.

Studies characterizing drugs or proteins as agonists, antagonists, partial agonists or
partial antagonists require the use of other proteins as competing ligands. The polypeptides of the
present invention or ligand(s) thereof may be labeled by being coupled to radioisotopes,
colorimetric molecules or toxin molecules by conventional methods ("Guide to Protein
Purification" Murray P. Deutscher (ed) Methods in Enzymology Vol. 182 (1990) Academic
Press, Inc. San Diego). Examples of radioisotopes include, but are not limited to, tritium and
carbon-14. Examples of colorimetric molecules include, but are not limited to, fluorescent
molecules such as fluorescamine, or rhodamine or other colorimetric molecules. Examples of
toxins include, but are not limited to, ricin.



4.10.13 DRUG SCREENING

This invention is particularly useful for screening chemical compounds by using the
novel polypeptides or binding fragments thereof in any of a variety of drug screening techniques.
The polypeptides or fragments employed in such a test may either be free in solution, affixed to a
solid support, borne on a cell surface or located intracellularly. One method of drug screening
utilizes eukaryotic or prokaryotic host cells which are stably transformed with recombinant
nucleic acids expressing the polypeptide or a fragment thereof. Drugs are screened against such
transformed cells in competitive binding assays. Such cells, either in viable or fixed form, can
be used for standard binding assays. One may measure, for example, the formation of complexes
between polypeptides of the invention or fragments and the agent being tested or examine the
diminution in complex formation between the novel polypeptides and an appropriate cell line,
which are well known in the art.

Sources for test compounds that may be screened for ability to bind to or modulate (i.e., 
increase or decrease) the activity of polypeptides of the invention include (1) inorganic and 
organic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries 
comprised of either random or mimetic peptides, oligonucleotides or organic molecules. 

Chemical libraries may be readily synthesized or purchased from a number of 
commercial sources, and may include structural analogs of known compounds or compounds 
that are identified as "hits" or "leads" via natural product screening.

The sources of natural product libraries are microorganisms (including bacteria and 
fungi), animals, plants or other vegetation, or marine organisms, and libraries of mixtures for 
screening may be created by: (1) fermentation and extraction of broths from soil, plant or marine 
microorganisms or (2) extraction of the organisms themselves. Natural product libraries include 
polyketides, non-ribosomal peptides, and (non-naturally occurring) variants thereof. For a 
review, see Science 282:63-68 (1998).

Combinatorial libraries are composed of large numbers of peptides, oligonucleotides or 
organic compounds and can be readily prepared by traditional automated synthesis methods, 
PCR, cloning or proprietary synthetic methods. Of particular interest are peptide and 
oligonucleotide combinatorial libraries. Still other libraries of interest include peptide, protein, 
peptidomimetic, multiparallel synthetic collection, recombinatorial, and polypeptide libraries. 
For a review of combinatorial chemistry and libraries created therefrom, see Myers, Curr. Opin. 
Biotechnol 8:701-707 (1997). For reviews and examples of peptidomimetic libraries, see 
Al-Obeidi et al., Mol Biotechnol, 9(3):205-23 (1998); Hruby et al., Curr Opin Chem Biol, 
1(1):114-19 (1997); Domer et al., Bioorg Med Chem, 4(5):709-15 (1996) (alkylated dipeptides).

Identification of modulators through use of the various libraries described herein permits 
modification of the candidate "hit" (or "lead") to optimize the capacity of the "hit" to bind a 
polypeptide of the invention. The molecules identified in the binding assay are then tested for 
antagonist or agonist activity in in vivo tissue culture or animal models that are well known in the 
art. In brief, the molecules are titrated into a plurality of cell cultures or animals and then tested 
for either cell/animal death or prolonged survival of the animal/cells. 

The binding molecules thus identified may be complexed with toxins, e.g., ricin or 
cholera, or with other compounds that are toxic to cells such as radioisotopes. The toxin-binding
molecule complex is then targeted to a tumor or other cell by the specificity of the binding
molecule for a polypeptide of the invention. Alternatively, the binding molecules may be 
complexed with imaging agents for targeting and imaging purposes. 



4.10.14 ASSAY FOR RECEPTOR ACTIVITY

The invention also provides methods to detect specific binding of a polypeptide, e.g., a
ligand or a receptor. The art provides numerous assays particularly useful for identifying
previously unknown binding partners for receptor polypeptides of the invention. For example,
expression cloning using mammalian or bacterial cells, or dihybrid screening assays can be used
to identify polynucleotides encoding binding partners. As another example, affinity
chromatography with the appropriate immobilized polypeptide of the invention can be used to
isolate polypeptides that recognize and bind polypeptides of the invention. There are a number
of different libraries used for the identification of compounds, and in particular small molecules,
that modulate (i.e., increase or decrease) biological activity of a polypeptide of the invention.
Ligands for receptor polypeptides of the invention can also be identified by adding exogenous
ligands, or cocktails of ligands, to two cell populations that are genetically identical except for
the expression of the receptor of the invention: one cell population expresses the receptor of the
invention whereas the other does not. The responses of the two cell populations to the addition of
ligand(s) are then compared. Alternatively, an expression library can be co-expressed with the
polypeptide of the invention in cells and assayed for an autocrine response to identify potential
ligand(s). As still another example, BIAcore assays, gel overlay assays, or other methods known
in the art can be used to identify binding partner polypeptides, including (1) organic and
inorganic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries
comprised of random peptides, oligonucleotides or organic molecules.

The role of downstream intracellular signaling molecules in the signaling cascade of the
polypeptide of the invention can be determined. For example, a chimeric protein in which the
cytoplasmic domain of the polypeptide of the invention is fused to the extracellular portion of a
protein, whose ligand has been identified, is produced in a host cell. The cell is then incubated
with the ligand specific for the extracellular portion of the chimeric protein, thereby activating
the chimeric receptor. Known downstream proteins involved in intracellular signaling can then
be assayed for expected modifications, e.g., phosphorylation. Other methods known to those in the
art can also be used to identify signaling molecules involved in receptor activity.

4.10.15 ANTI-INFLAMMATORY ACTIVITY

Compositions of the present invention may also exhibit anti-inflammatory activity. The
anti-inflammatory activity may be achieved by providing a stimulus to cells involved in the
inflammatory response, by inhibiting or promoting cell-cell interactions (such as, for example,
cell adhesion), by inhibiting or promoting chemotaxis of cells involved in the inflammatory
process, inhibiting or promoting cell extravasation, or by stimulating or suppressing production
of other factors which more directly inhibit or promote an inflammatory response. Compositions
with such activities can be used to treat inflammatory conditions (including chronic or acute
conditions), including without limitation inflammation associated with infection (such as septic
shock, sepsis or systemic inflammatory response syndrome (SIRS)), ischemia-reperfusion injury,
endotoxin lethality, arthritis, complement-mediated hyperacute rejection, nephritis, cytokine or
chemokine-induced lung injury, inflammatory bowel disease, Crohn's disease or resulting from
over production of cytokines such as TNF or IL-1. Compositions of the invention may also be
useful to treat anaphylaxis and hypersensitivity to an antigenic substance or material.
Compositions of this invention may be utilized to prevent or treat conditions such as, but not
limited to, sepsis, acute pancreatitis, endotoxin shock, cytokine induced shock, rheumatoid
arthritis, chronic inflammatory arthritis, pancreatic cell damage from diabetes mellitus type 1,
graft versus host disease, inflammatory bowel disease, inflammation associated with pulmonary
disease, other autoimmune disease or inflammatory disease, or as an antiproliferative agent such as for
acute or chronic myelogenous leukemia or in the prevention of premature labor secondary to
intrauterine infections.



4.10.16 LEUKEMIAS 

Leukemias and related disorders may be treated or prevented by administration of a 
therapeutic that promotes or inhibits function of the polynucleotides and/or polypeptides of the 
invention. Such leukemias and related disorders include but are not limited to acute leukemia,
acute lymphocytic leukemia, acute myelocytic leukemia, myeloblastic, promyelocytic,
myelomonocytic, monocytic, erythroleukemia, chronic leukemia, chronic myelocytic 
(granulocytic) leukemia and chronic lymphocytic leukemia (for a review of such disorders, see 
Fishman et al., 1985, Medicine, 2d Ed., J.B. Lippincott Co., Philadelphia). 

4.10.17 NERVOUS SYSTEM DISORDERS 

Nervous system disorders, involving cell types which can be tested for efficacy of 
intervention with compounds that modulate the activity of the polynucleotides and/or 
polypeptides of the invention, and which can be treated upon thus observing an indication of 
therapeutic utility, include but are not limited to nervous system injuries, and diseases or
disorders which result in either a disconnection of axons, a diminution or degeneration of
neurons, or demyelination. Nervous system lesions which may be treated in a patient (including
human and non-human mammalian patients) according to the invention include but are not
limited to the following lesions of either the central (including spinal cord, brain) or peripheral
nervous systems:

(i) traumatic lesions, including lesions caused by physical injury or associated with 
surgery, for example, lesions which sever a portion of the nervous system, or compression 
injuries; 

(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous system
results in neuronal injury or death, including cerebral infarction or ischemia, or spinal cord
infarction or ischemia;

(iii) infectious lesions, in which a portion of the nervous system is destroyed or injured 
as a result of infection, for example, by an abscess or associated with infection by human 
immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme disease,
tuberculosis, or syphilis;

(iv) degenerative lesions, in which a portion of the nervous system is destroyed or 
injured as a result of a degenerative process including but not limited to degeneration associated 
with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or amyotrophic lateral 
sclerosis; 

(v) lesions associated with nutritional diseases or disorders, in which a portion of the
nervous system is destroyed or injured by a nutritional disorder or disorder of metabolism
including but not limited to, vitamin B12 deficiency, folic acid deficiency, Wernicke disease,
tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary degeneration of the corpus 
callosum), and alcoholic cerebellar degeneration; 

(vi) neurological lesions associated with systemic diseases including but not limited to
diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus, carcinoma, or
sarcoidosis; 

(vii) lesions caused by toxic substances including alcohol, lead, or particular 
neurotoxins; and 

(viii) demyelinated lesions in which a portion of the nervous system is destroyed or
injured by a demyelinating disease including but not limited to multiple sclerosis, human
immunodeficiency virus-associated myelopathy, transverse myelopathy of various etiologies,
progressive multifocal leukoencephalopathy, and central pontine myelinolysis. 

Therapeutics which are useful according to the invention for treatment of a nervous
system disorder may be selected by testing for biological activity in promoting the survival or
differentiation of neurons. For example, and not by way of limitation, therapeutics which elicit
any of the following effects may be useful according to the invention:

(i) increased survival time of neurons in culture;

(ii) increased sprouting of neurons in culture or in vivo;

(iii) increased production of a neuron-associated molecule in culture or in vivo, e.g.,
choline acetyltransferase or acetylcholinesterase with respect to motor neurons; or

(iv) decreased symptoms of neuron dysfunction in vivo.

Such effects may be measured by any method known in the art. In preferred,
non-limiting embodiments, increased survival of neurons may be measured by the method set
forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting of neurons may
be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol. 70:65-82) or Brown et al.
(1981, Ann. Rev. Neurosci. 4:17-42); increased production of neuron-associated molecules may
be measured by bioassay, enzymatic assay, antibody binding, Northern blot assay, etc.,
depending on the molecule to be measured; and motor neuron dysfunction may be measured by
assessing the physical manifestation of motor neuron disorder, e.g., weakness, motor neuron
conduction velocity, or functional disability.

In specific embodiments, motor neuron disorders that may be treated according to the
invention include but are not limited to disorders such as infarction, infection, exposure to toxin,
trauma, surgical damage, degenerative disease or malignancy that may affect motor neurons as
well as other components of the nervous system, as well as disorders that selectively affect
neurons such as amyotrophic lateral sclerosis, and including but not limited to progressive spinal
muscular atrophy, progressive bulbar palsy, primary lateral sclerosis, infantile and juvenile
muscular atrophy, progressive bulbar paralysis of childhood (Fazio-Londe syndrome),
poliomyelitis and the post polio syndrome, and Hereditary Motorsensory Neuropathy
(Charcot-Marie-Tooth Disease).

4.10.18 OTHER ACTIVITIES 

A polypeptide of the invention may also exhibit one or more of the following additional
activities or effects: inhibiting the growth, infection or function of, or killing, infectious agents,
including, without limitation, bacteria, viruses, fungi and other parasites; affecting (suppressing
or enhancing) bodily characteristics, including, without limitation, height, weight, hair color, eye
color, skin, fat to lean ratio or other tissue pigmentation, or organ or body part size or shape
(such as, for example, breast augmentation or diminution, change in bone form or shape);
affecting biorhythms or circadian cycles or rhythms; affecting the fertility of male or female
subjects; affecting the metabolism, catabolism, anabolism, processing, utilization, storage or
elimination of dietary fat, lipid, protein, carbohydrate, vitamins, minerals, co-factors or other
nutritional factors or component(s); affecting behavioral characteristics, including, without
limitation, appetite, libido, stress, cognition (including cognitive disorders), depression
(including depressive disorders) and violent behaviors; providing analgesic effects or other pain
reducing effects; promoting differentiation and growth of embryonic stem cells in lineages other
than hematopoietic lineages; hormonal or endocrine activity; in the case of enzymes, correcting
deficiencies of the enzyme and treating deficiency-related diseases; treatment of
hyperproliferative disorders (such as, for example, psoriasis); immunoglobulin-like activity (such
as, for example, the ability to bind antigens or complement); and the ability to act as an antigen
in a vaccine composition to raise an immune response against such protein or another material or
entity which is cross-reactive with such protein.

4.10.19 IDENTIFICATION OF POLYMORPHISMS 

The demonstration of polymorphisms makes possible the identification of such
polymorphisms in human subjects and the pharmacogenetic use of this information for diagnosis
and treatment. Such polymorphisms may be associated with, e.g., differential predisposition or
susceptibility to various disease states (such as disorders involving inflammation or immune
response) or a differential response to drug administration, and this genetic information can be
used to tailor preventive or therapeutic treatment appropriately. For example, the existence of a
polymorphism associated with a predisposition to inflammation or autoimmune disease makes
possible the diagnosis of this condition in humans by identifying the presence of the
polymorphism.

Polymorphisms can be identified in a variety of ways known in the art which all
generally involve obtaining a sample from a patient, analyzing DNA from the sample, optionally
involving isolation or amplification of the DNA, and identifying the presence of the
polymorphism in the DNA. For example, PCR may be used to amplify an appropriate fragment
of genomic DNA which may then be sequenced. Alternatively, the DNA may be subjected to
allele-specific oligonucleotide hybridization (in which appropriate oligonucleotides are
hybridized to the DNA under conditions permitting detection of a single base mismatch) or to a
single nucleotide extension assay (in which an oligonucleotide that hybridizes immediately
adjacent to the position of the polymorphism is extended with one or more labeled nucleotides).
In addition, traditional restriction fragment length polymorphism analysis (using restriction
enzymes that provide differential digestion of the genomic DNA depending on the presence or
absence of the polymorphism) may be performed. Arrays with nucleotide sequences of the
present invention can be used to detect polymorphisms. The array can comprise modified
nucleotide sequences of the present invention in order to detect the nucleotide sequences of the
present invention. In the alternative, any one of the nucleotide sequences of the present
invention can be placed on the array to detect changes from those sequences.

Alternatively, a polymorphism resulting in a change in the amino acid sequence could
also be detected by detecting a corresponding change in amino acid sequence of the protein, e.g.,
by an antibody specific to the variant sequence.
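
By way of non-limiting illustration, the sequence-comparison and restriction fragment length approaches described above can be sketched computationally as follows. The sketch assumes that a sequenced fragment and its reference are already aligned and of equal length, and it models a single linear strand only. The function names, the two 20-base sequences, and the choice of the EcoRI recognition site (GAATTC, cut after the first base) are hypothetical examples chosen for illustration and are not sequences of the invention.

```python
def find_substitutions(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, reference base, sample base) for each mismatch."""
    if len(reference) != len(sample):
        raise ValueError("this sketch only compares sequences of equal length")
    return [
        (i + 1, r, s)
        for i, (r, s) in enumerate(zip(reference.upper(), sample.upper()))
        if r != s
    ]


def fragment_lengths(sequence: str, site: str, cut_offset: int) -> list[int]:
    """Lengths of the fragments produced by cutting a linear sequence at every
    occurrence of a recognition site (cut_offset bases into the site)."""
    sequence = sequence.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Hypothetical reference and variant fragments (illustration only).
REFERENCE = "ATGCCGAATTCGTTAGCCTA"
VARIANT = "ATGCCGAATTTGTTAGCCTA"

# Direct comparison of the sequenced fragment against the reference.
print(find_substitutions(REFERENCE, VARIANT))

# The substitution abolishes the recognition site, so the digestion
# pattern differs between the two alleles.
print(fragment_lengths(REFERENCE, "GAATTC", 1))
print(fragment_lengths(VARIANT, "GAATTC", 1))
```

In this sketch a single base change is reported by direct comparison, and the same change is also visible as a difference in fragment lengths because it removes the recognition site from the variant fragment.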



4.10.20 ARTHRITIS AND INFLAMMATION 

The immunosuppressive effects of the compositions of the invention against rheumatoid
arthritis are determined in an experimental animal model system. The experimental model system
is adjuvant induced arthritis in rats, and the protocol is described by J. Holoshitz, et al., 1983,
Science, 219:56, or by B. Waksman et al., 1963, Int. Arch. Allergy Appl. Immunol., 23:129.
Induction of the disease can be caused by a single injection, generally intradermally, of a
suspension of killed Mycobacterium tuberculosis in complete Freund's adjuvant (CFA). The
route of injection can vary, but rats may be injected at the base of the tail with an adjuvant
mixture. The polypeptide is administered in phosphate buffered solution (PBS) at a dose of about

The procedure for testing the effects of the test compound would consist of intradermally
injecting killed Mycobacterium tuberculosis in CFA followed by immediately administering the
test compound and subsequent treatment every other day until day 24. At 14, 15, 18, 20, 22, and
24 days after injection of Mycobacterium CFA, an overall arthritis score may be obtained as
described by J. Holoshitz above. An analysis of the data would reveal that the test compound
would have a dramatic effect on the swelling of the joints as measured by a decrease of the
arthritis score.
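
By way of non-limiting illustration, the treatment and scoring schedule described above, and the comparison of arthritis scores between treated and PBS control animals, can be sketched as follows. The sketch assumes that "immediately" corresponds to day 0 and that "every other day until day 24" means days 0, 2, 4 and so on through day 24. The names and the dictionary layout of the score data are hypothetical, and no score values are supplied here.

```python
from statistics import mean

# Assumed reading of the protocol: first dose on day 0, then every other day to day 24.
TREATMENT_DAYS = list(range(0, 25, 2))

# Days after adjuvant injection on which an overall arthritis score is obtained.
SCORING_DAYS = [14, 15, 18, 20, 22, 24]


def mean_score_by_day(scores: dict[int, list[float]]) -> dict[int, float]:
    """Average the per-animal arthritis scores recorded on each scoring day."""
    missing = [day for day in SCORING_DAYS if day not in scores]
    if missing:
        raise ValueError(f"no scores recorded for days {missing}")
    return {day: mean(scores[day]) for day in SCORING_DAYS}


def score_reduction(control: dict[int, list[float]],
                    treated: dict[int, list[float]]) -> dict[int, float]:
    """Mean control score minus mean treated score on each scoring day.

    A positive value indicates a lower (improved) score in the treated group."""
    control_means = mean_score_by_day(control)
    treated_means = mean_score_by_day(treated)
    return {day: control_means[day] - treated_means[day] for day in SCORING_DAYS}


print(TREATMENT_DAYS)
```

Each argument to `score_reduction` maps a scoring day to the list of scores recorded for the animals of one group on that day.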

4.11 THERAPEUTIC METHODS 

The compositions (including polypeptide fragments, analogs, variants and antibodies or 
other binding partners or modulators including antisense polynucleotides) of the invention have 
numerous applications in a variety of therapeutic methods. Examples of therapeutic applications 
include, but are not limited to, those exemplified herein.



4.11.1 EXAMPLE

One embodiment of the invention is the administration of an effective amount of the 
polypeptides or other composition of the invention to individuals affected by a disease or 
disorder that can be modulated by regulating the peptides of the invention. While the mode of
administration is not particularly important, parenteral administration is preferred. An
exemplary mode of administration is to deliver an intravenous bolus. The dosage of the 
polypeptides or other composition of the invention will normally be determined by the 
prescribing physician. It is to be expected that the dosage will vary according to the age, weight, 
condition and response of the individual patient. Typically, the amount of polypeptide 
administered per dose will be in the range of about 0.01 ng/kg to 100 mg/kg of body weight, with
the preferred dose being about 0.1 µg/kg to 10 mg/kg of patient body weight. For parenteral
administration, polypeptides of the invention will be formulated in an injectable form combined 
with a pharmaceutically acceptable parenteral vehicle. Such vehicles are well known in the art 
and examples include water, saline, Ringer's solution, dextrose solution, and solutions consisting 
of small amounts of the human serum albumin. The vehicle may contain minor amounts of 
additives that maintain the isotonicity and stability of the polypeptide or other active ingredient. 
The preparation of such solutions is within the skill of the art. 
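
By way of non-limiting illustration, the weight-based arithmetic implied by the per-dose range of about 0.01 ng/kg to 100 mg/kg can be sketched as follows. The sketch only converts the stated range into absolute amounts for a given body weight; the function names and the 70 kg example weight are hypothetical, and, as stated above, the dosage itself would be determined by the prescribing physician.

```python
NG_PER_MG = 1_000_000  # 1 mg = 1,000,000 ng

# Per-dose range stated in the text, expressed in ng per kg of body weight.
MIN_DOSE_NG_PER_KG = 0.01
MAX_DOSE_NG_PER_KG = 100 * NG_PER_MG


def dose_range_ng(body_weight_kg: float) -> tuple[float, float]:
    """Return the (lowest, highest) absolute amount per dose, in ng."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return (MIN_DOSE_NG_PER_KG * body_weight_kg,
            MAX_DOSE_NG_PER_KG * body_weight_kg)


def is_within_stated_range(dose_ng_per_kg: float) -> bool:
    """Check whether a per-kg dose falls inside the stated range."""
    return MIN_DOSE_NG_PER_KG <= dose_ng_per_kg <= MAX_DOSE_NG_PER_KG


# Hypothetical 70 kg patient (illustration only).
low_ng, high_ng = dose_range_ng(70)
print(f"{low_ng:.2f} ng to {high_ng / NG_PER_MG:.0f} mg per dose")
```

Working in a single unit (ng) throughout avoids mixing nanogram and milligram quantities until the final display step.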

4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF 
ADMINISTRATION 

A protein or other composition of the present invention (from whatever source derived, 
including without limitation from recombinant and non-recombinant sources and including 
antibodies and other binding partners of the polypeptides of the invention) may be administered 
to a patient in need, by itself, or in pharmaceutical compositions where it is mixed with suitable 
carriers or excipient(s) at doses to treat or ameliorate a variety of disorders. Such a composition 
may optionally contain (in addition to protein or other active ingredient and a carrier) diluents, 
fillers, salts, buffers, stabilizers, solubilizers, and other materials well known in the art. The term 
"pharmaceutically acceptable" means a non-toxic material that does not interfere with the 
effectiveness of the biological activity of the active ingredient(s). The characteristics of the 
carrier will depend on the route of administration. The pharmaceutical composition of the 
invention may also contain cytokines, lymphokines, or other hematopoietic factors such as 
M-CSF, GM-CSF, TNF, IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12,
IL-13, IL-14, IL-15, IFN, TNF0, TNF1, TNF2, G-CSF, Meg-CSF, thrombopoietin, stem cell
factor, and erythropoietin. In further compositions, proteins of the invention may be combined 
with other agents beneficial to the treatment of the disease or disorder in question. These agents 
include various growth factors such as epidermal growth factor (EGF), platelet-derived growth 
factor (PDGF), transforming growth factors (TGF-α and TGF-β), insulin-like growth factor
(IGF), as well as cytokines described herein.

The pharmaceutical composition may further contain other agents which either enhance
the activity of the protein or other active ingredient or complement its activity or use in
treatment. Such additional factors and/or agents may be included in the pharmaceutical
composition to produce a synergistic effect with protein or other active ingredient of the
invention, or to minimize side effects. Conversely, protein or other active ingredient of the
present invention may be included in formulations of the particular clotting factor, cytokine, 
lymphokine, other hematopoietic factor, thrombolytic or anti-thrombotic factor, or
anti-inflammatory agent to minimize side effects of the clotting factor, cytokine, lymphokine, other
hematopoietic factor, thrombolytic or anti-thrombotic factor, or anti-inflammatory agent (such as
IL-1Ra, IL-1 Hy1, IL-1 Hy2, anti-TNF, corticosteroids, immunosuppressive agents). A protein
of the present invention may be active in multimers (e.g., heterodimers or homodimers) or 
complexes with itself or other proteins. As a result, pharmaceutical compositions of the 
invention may comprise a protein of the invention in such multimeric or complexed form. 
As an alternative to being included in a pharmaceutical composition of the invention
including a first protein, a second protein or a therapeutic agent may be concurrently
administered with the first protein (e.g., at the same time, or at differing times provided that
therapeutic concentrations of the combination of agents are achieved at the treatment site).
Techniques for formulation and administration of the compounds of the instant application may
be found in "Remington's Pharmaceutical Sciences," Mack Publishing Co., Easton, PA, latest
edition. A therapeutically effective dose further refers to that amount of the compound sufficient
to result in amelioration of symptoms, e.g., treatment, healing, prevention or amelioration of the 
relevant medical condition, or an increase in rate of treatment, healing, prevention or 
amelioration of such conditions. When applied to an individual active ingredient, administered 
alone, a therapeutically effective dose refers to that ingredient alone. When applied to a
combination, a therapeutically effective dose refers to combined amounts of the active
ingredients that result in the therapeutic effect, whether administered in combination, serially or
simultaneously.

In practicing the method of treatment or use of the present invention, a therapeutically 
effective amount of protein or other active ingredient of the present invention is administered to
a mammal having a condition to be treated. Protein or other active ingredient of the present
invention may be administered in accordance with the method of the invention either alone or in
combination with other therapies such as treatments employing cytokines, lymphokines or other
hematopoietic factors. When co-administered with one or more cytokines, lymphokines or other
hematopoietic factors, protein or other active ingredient of the present invention may be
administered either simultaneously with the cytokine(s), lymphokine(s), other hematopoietic
factor(s), thrombolytic or anti-thrombotic factors, or sequentially. If administered sequentially,
the attending physician will decide on the appropriate sequence of administering protein or other
active ingredient of the present invention in combination with cytokine(s), lymphokine(s), other
hematopoietic factor(s), thrombolytic or anti-thrombotic factors.

4.12.1 ROUTES OF ADMINISTRATION 

Suitable routes of administration may, for example, include oral, rectal, transmucosal, or 
intestinal administration; parenteral delivery, including intramuscular, subcutaneous, 
intramedullary injections, as well as intrathecal, direct intraventricular, intravenous,
intraperitoneal, intranasal, or intraocular injections. Administration of protein or other active
ingredient of the present invention used in the pharmaceutical composition or to practice the 
method of the present invention can be carried out in a variety of conventional ways, such as oral 
ingestion, inhalation, topical application or cutaneous, subcutaneous, intraperitoneal, parenteral 
or intravenous injection. Intravenous administration to the patient is preferred. 

Alternately, one may administer the compound in a local rather than systemic manner, for
example, via injection of the compound directly into arthritic joints or fibrotic tissue, often in
a depot or sustained release formulation. In order to prevent the scarring process frequently
occurring as a complication of glaucoma surgery, the compounds may be administered topically,
for example, as eye drops. Furthermore, one may administer the drug in a targeted drug delivery
system, for example, in a liposome coated with a specific antibody, targeting, for example,
arthritic or fibrotic tissue. The liposomes will be targeted to and taken up selectively by the
afflicted tissue.

The polypeptides of the invention are administered by any route that delivers an effective 
dosage to the desired site of action. The determination of a suitable route of administration and 
an effective dosage for a particular indication is within the level of skill in the art. Preferably for
wound treatment, one administers the therapeutic compound directly to the site. Suitable dosage
ranges for the polypeptides of the invention can be extrapolated from these dosages or from
similar studies in appropriate animal models. Dosages can then be adjusted as necessary by the
clinician to provide maximal therapeutic benefit.

4.12.2 COMPOSITIONS/FORMULATIONS 

Pharmaceutical compositions for use in accordance with the present invention thus may 
be formulated in a conventional manner using one or more physiologically acceptable carriers 
comprising excipients and auxiliaries which facilitate processing of the active compounds into 
preparations which can be used pharmaceutically. These pharmaceutical compositions may be
manufactured in a manner that is itself known, e.g., by means of conventional mixing,
dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating, entrapping or
lyophilizing processes. Proper formulation is dependent upon the route of administration chosen.
When a therapeutically effective amount of protein or other active ingredient of the present
invention is administered orally, protein or other active ingredient of the present invention will
be in the form of a tablet, capsule, powder, solution or elixir. When administered in tablet form, 
the pharmaceutical composition of the invention may additionally contain a solid carrier such as 
a gelatin or an adjuvant. The tablet, capsule, and powder contain from about 5 to 95% protein or 
other active ingredient of the present invention, and preferably from about 25 to 90% protein or
other active ingredient of the present invention. When administered in liquid form, a liquid
carrier such as water, petroleum, oils of animal or plant origin such as peanut oil, mineral oil, 
soybean oil, or sesame oil, or synthetic oils may be added. The liquid form of the 
pharmaceutical composition may further contain physiological saline solution, dextrose or other 
saccharide solution, or glycols such as ethylene glycol, propylene glycol or polyethylene glycol.
When administered in liquid form, the pharmaceutical composition contains from about 0.5 to
90% by weight of protein or other active ingredient of the present invention, and preferably from 
about 1 to 50% protein or other active ingredient of the present invention. 

When a therapeutically effective amount of protein or other active ingredient of the 
present invention is administered by intravenous, cutaneous or subcutaneous injection, protein or
other active ingredient of the present invention will be in the form of a pyrogen-free, parenterally
acceptable aqueous solution. The preparation of such parenterally acceptable protein or other 
active ingredient solutions, having due regard to pH, isotonicity, stability, and the like, is within 
the skill in the art. A preferred pharmaceutical composition for intravenous, cutaneous, or 
subcutaneous injection should contain, in addition to protein or other active ingredient of the
present invention, an isotonic vehicle such as Sodium Chloride Injection, Ringer's Injection,
Dextrose Injection, Dextrose and Sodium Chloride Injection, Lactated Ringer's Injection, or 
other vehicle as known in the art. The pharmaceutical composition of the present invention may 
also contain stabilizers, preservatives, buffers, antioxidants, or other additives known to those of 
skill in the art. For injection, the agents of the invention may be formulated in aqueous solutions,
preferably in physiologically compatible buffers such as Hanks' solution, Ringer's solution, or
physiological saline buffer. For transmucosal administration, penetrants appropriate to the 
barrier to be permeated are used in the formulation. Such penetrants are generally known in the 
art. 

For oral administration, the compounds can be formulated readily by combining the 
active compounds with pharmaceutically acceptable carriers well known in the art. Such carriers
enable the compounds of the invention to be formulated as tablets, pills, dragees, capsules,
liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by a patient to be
treated. Pharmaceutical preparations for oral use can be obtained by combining the active
compounds with a solid excipient, optionally grinding the resulting mixture, and processing the
mixture of granules, after adding suitable auxiliaries, if desired, to obtain tablets or dragee cores.
Suitable excipients are, in
particular, fillers such as sugars, including lactose, sucrose, mannitol, or sorbitol; cellulose 
preparations such as, for example, maize starch, wheat starch, rice starch, potato starch, gelatin, 
gum tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, sodium 
carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired, disintegrating agents 
may be added, such as the cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt 
thereof such as sodium alginate. Dragee cores are provided with suitable coatings. For this 
purpose, concentrated sugar solutions may be used, which may optionally contain gum arabic, 
talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer 
solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments may be 
added to the tablets or dragee coatings for identification or to characterize different combinations 
of active compound doses. 

Pharmaceutical preparations which can be used orally include push-fit capsules made of 
gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as glycerol or 
sorbitol. The push-fit capsules can contain the active ingredients in admixture with filler such as 
lactose, binders such as starches, and/or lubricants such as talc or magnesium stearate and, 
optionally, stabilizers. In soft capsules, the active compounds may be dissolved or suspended in 
suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition, 
stabilizers may be added. All formulations for oral administration should be in dosages suitable 
for such administration. For buccal administration, the compositions may take the form of 
tablets or lozenges formulated in conventional manner. 

For administration by inhalation, the compounds for use according to the present 
invention are conveniently delivered in the form of an aerosol spray presentation from 
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g., 
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or 
other suitable gas. In the case of a pressurized aerosol the dosage unit may be determined by 
providing a valve to deliver a metered amount. Capsules and cartridges of, e.g., gelatin for use in 
an inhaler or insufflator may be formulated containing a powder mix of the compound and a 
suitable powder base such as lactose or starch. The compounds may be formulated for parenteral 
administration by injection, e.g., by bolus injection or continuous infusion. Formulations for 
injection may be presented in unit dosage form, e.g., in ampules or in multi-dose containers, with
an added preservative. The compositions may take such forms as suspensions, solutions or
emulsions in oily or aqueous vehicles, and may contain formulatory agents such as suspending, 
stabilizing and/or dispersing agents. 

Pharmaceutical formulations for parenteral administration include aqueous solutions of 
the active compounds in water-soluble form. Additionally, suspensions of the active compounds 
may be prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or 
vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or 
triglycerides, or liposomes. Aqueous injection suspensions may contain substances which 
increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or 
dextran. Optionally, the suspension may also contain suitable stabilizers or agents which 
increase the solubility of the compounds to allow for the preparation of highly concentrated 
solutions. Alternatively, the active ingredient may be in powder form for constitution with a 
suitable vehicle, e.g., sterile pyrogen-free water, before use. 

The compounds may also be formulated in rectal compositions such as suppositories or 
retention enemas, e.g., containing conventional suppository bases such as cocoa butter or other 
glycerides. In addition to the formulations described previously, the compounds may also be 
formulated as a depot preparation. Such long acting formulations may be administered by 
implantation (for example subcutaneously or intramuscularly) or by intramuscular injection. 
Thus, for example, the compounds may be formulated with suitable polymeric or hydrophobic 
materials (for example as an emulsion in an acceptable oil) or ion exchange resins, or as 
sparingly soluble derivatives, for example, as a sparingly soluble salt. 

A pharmaceutical carrier for the hydrophobic compounds of the invention is a co-solvent 
system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible organic polymer, and 
an aqueous phase. The co-solvent system may be the VPD co-solvent system. VPD is a solution 
of 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant polysorbate 80, and 65% w/v 
polyethylene glycol 300, made up to volume in absolute ethanol. The VPD co-solvent system 
(VPD:5W) consists of VPD diluted 1:1 with a 5% dextrose in water solution. This co-solvent
system dissolves hydrophobic compounds well, and itself produces low toxicity upon systemic 
administration. Naturally, the proportions of a co-solvent system may be varied considerably 
without destroying its solubility and toxicity characteristics. Furthermore, the identity of the 
co-solvent components may be varied: for example, other low-toxicity nonpolar surfactants may 
be used instead of polysorbate 80; the fraction size of polyethylene glycol may be varied; other 
biocompatible polymers may replace polyethylene glycol, e.g. polyvinylpyrrolidone; and other
sugars or polysaccharides may substitute for dextrose. Alternatively, other delivery systems for 
hydrophobic pharmaceutical compounds may be employed. Liposomes and emulsions are well
known examples of delivery vehicles or carriers for hydrophobic drugs. Certain organic solvents
such as dimethylsulfoxide also may be employed, although usually at the cost of greater toxicity.
Additionally, the compounds may be delivered using a sustained-release system, such as
semipermeable matrices of solid hydrophobic polymers containing the therapeutic agent.
Various types of sustained-release materials have been established and are well known by those
skilled in the art. Sustained-release capsules may, depending on their chemical nature, release the 
compounds for a few weeks up to over 100 days. Depending on the chemical nature and the 
biological stability of the therapeutic reagent, additional strategies for protein or other active 
ingredient stabilization may be employed. 
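
The composition of the diluted VPD:5W system follows from simple mixing arithmetic. The sketch below is illustrative only: it assumes that the 1:1 dilution is by volume and that volumes are additive, neither of which the text states, and the function and variable names are hypothetical.

```python
# Illustrative arithmetic only. Assumes a 1:1 dilution by volume with additive volumes.
# Concentrations are % w/v, taken from the VPD description in the text.
VPD = {"benzyl alcohol": 3.0, "polysorbate 80": 8.0, "polyethylene glycol 300": 65.0}
D5W = {"dextrose": 5.0}  # 5% dextrose in water


def mix(a, b, parts_a=1.0, parts_b=1.0):
    """Return the % w/v of each component after mixing parts_a of a with parts_b of b."""
    total = parts_a + parts_b
    result = {}
    for name in sorted(set(a) | set(b)):
        result[name] = (a.get(name, 0.0) * parts_a + b.get(name, 0.0) * parts_b) / total
    return result


vpd_5w = mix(VPD, D5W)
for name, pct in vpd_5w.items():
    print(f"{name}: {pct}% w/v")
```

Under these assumptions each VPD component is halved and dextrose ends at 2.5% w/v. Changing `parts_a` and `parts_b` shows how varying the proportions, as the text permits, shifts the final concentrations.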

The pharmaceutical compositions also may comprise suitable solid or gel phase carriers
or excipients. Examples of such carriers or excipients include but are not limited to calcium
carbonate, calcium phosphate, various sugars, starches, cellulose derivatives, gelatin, and
polymers such as polyethylene glycols. Many of the active ingredients of the invention may be
provided as salts with pharmaceutically compatible counter ions. Such pharmaceutically
acceptable base addition salts are those salts which retain the biological effectiveness and
properties of the free acids and which are obtained by reaction with inorganic or organic bases
such as sodium hydroxide, magnesium hydroxide, ammonia, trialkylamine, dialkylamine, 
monoalkylamine, dibasic amino acids, sodium acetate, potassium benzoate, triethanol amine and 
the like. 

The pharmaceutical composition of the invention may be in the form of a complex of the
protein(s) or other active ingredient(s) of the present invention along with protein or peptide
antigens. The protein and/or peptide antigen will deliver a stimulatory signal to both B and T
lymphocytes. B lymphocytes will respond to antigen through their surface immunoglobulin
receptor. T lymphocytes will respond to antigen through the T cell receptor (TCR) following
presentation of the antigen by MHC proteins. MHC and structurally related proteins including
those encoded by class I and class II MHC genes on host cells will serve to present the peptide
antigen(s) to T lymphocytes. The antigen components could also be supplied as purified
MHC-peptide complexes alone or with co-stimulatory molecules that can directly signal T cells.
Alternatively antibodies able to bind surface immunoglobulin and other molecules on B cells as
well as antibodies able to bind the TCR and other molecules on T cells can be combined with the
pharmaceutical composition of the invention. 

The pharmaceutical composition of the invention may be in the form of a liposome in 
which protein of the present invention is combined, in addition to other pharmaceutically 
acceptable carriers, with amphipathic agents such as lipids which exist in aggregated form as
micelles, insoluble monolayers, liquid crystals, or lamellar layers in aqueous solution. Suitable
lipids for liposomal formulation include, without limitation, monoglycerides, diglycerides,
sulfatides, lysolecithins, phospholipids, saponin, bile acids, and the like. Preparation of such 
liposomal formulations is within the level of skill in the art, as disclosed, for example, in U.S. 
Patent Nos. 4,235,871; 4,501,728; 4,837,028; and 4,737,323, all of which are incorporated 
herein by reference. 

The amount of protein or other active ingredient of the present invention in the 
pharmaceutical composition of the present invention will depend upon the nature and severity of 
the condition being treated, and on the nature of prior treatments which the patient has 
undergone. Ultimately, the attending physician will decide the amount of protein or other active 
ingredient of the present invention with which to treat each individual patient. Initially, the 
attending physician will administer low doses of protein or other active ingredient of the present 
invention and observe the patient's response. Larger doses of protein or other active ingredient 
of the present invention may be administered until the optimal therapeutic effect is obtained for 
the patient, and at that point the dosage is not increased further. It is contemplated that the 
various pharmaceutical compositions used to practice the method of the present invention should 
contain about 0.01 ng to about 100 mg (preferably about 0.1 µg to about 10 mg, more preferably
about 0.1 µg to about 1 mg) of protein or other active ingredient of the present invention per kg
body weight. For compositions of the present invention which are useful for bone, cartilage, 
tendon or ligament regeneration, the therapeutic method includes administering the composition 
topically, systemically, or locally as an implant or device. When administered, the therapeutic
composition for use in this invention is, of course, in a pyrogen-free, physiologically acceptable
form. Further, the composition may desirably be encapsulated or injected in a viscous form for 
delivery to the site of bone, cartilage or tissue damage. Topical administration may be suitable 
for wound healing and tissue repair. Therapeutically useful agents other than a protein or other 
active ingredient of the invention which may also optionally be included in the composition as 
described above, may alternatively or additionally, be administered simultaneously or 
sequentially with the composition in the methods of the invention. Preferably for bone and/or 
cartilage formation, the composition would include a matrix capable of delivering the 
protein-containing or other active ingredient-containing composition to the site of bone and/or 
cartilage damage, providing a structure for the developing bone and cartilage and optimally 
capable of being resorbed into the body. Such matrices may be formed of materials presently in 
use for other implanted medical applications. 

The choice of matrix material is based on biocompatibility, biodegradability, mechanical 
properties/cosmetic appearance and interface properties. The particular application of the 
compositions will define the appropriate formulation. Potential matrices for the compositions
may be biodegradable and chemically defined calcium sulfate, tricalcium phosphate,
hydroxyapatite, polylactic acid, polyglycolic acid and polyanhydrides. Other potential materials 
are biodegradable and biologically well-defined, such as bone or dermal collagen. Further 
matrices are comprised of pure proteins or extracellular matrix components. Other potential 
matrices are nonbiodegradable and chemically defined, such as sintered hydroxyapatite, bioglass,
aluminates, or other ceramics. Matrices may be comprised of combinations of any of the above
mentioned types of material, such as polylactic acid and hydroxyapatite or collagen and
tricalcium phosphate. The bioceramics may be altered in composition, such as in
calcium-aluminate-phosphate and processing to alter pore size, particle size, particle shape, and
biodegradability. Presently preferred is a 50:50 (mole weight) copolymer of lactic acid and
glycolic acid in the form of porous particles having diameters ranging from 150 to 800 microns.
In some applications, it will be useful to utilize a sequestering agent, such as carboxymethyl 
cellulose or autologous blood clot, to prevent the protein compositions from disassociating from 
the matrix. 

A preferred family of sequestering agents is cellulosic materials such as alkylcelluloses
(including hydroxyalkylcelluloses), including methylcellulose, ethylcellulose,
hydroxyethylcellulose, hydroxypropylcellulose, hydroxypropyl-methylcellulose, and
carboxymethylcellulose, the most preferred being cationic salts of carboxymethylcellulose
(CMC). Other preferred sequestering agents include hyaluronic acid, sodium alginate,
poly(ethylene glycol), polyoxyethylene oxide, carboxyvinyl polymer and poly(vinyl alcohol).
The amount of sequestering agent useful herein is 0.5-20 wt %, preferably 1-10 wt % based on 
total formulation weight, which represents the amount necessary to prevent desorption of the 
protein from the polymer matrix and to provide appropriate handling of the composition, yet not 
so much that the progenitor cells are prevented from infiltrating the matrix, thereby providing the
protein the opportunity to assist the osteogenic activity of the progenitor cells. In further
compositions, proteins or other active ingredients of the invention may be combined with other
agents beneficial to the treatment of the bone and/or cartilage defect, wound, or tissue in
question. These agents include various growth factors such as epidermal growth factor (EGF),
platelet derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β), and
insulin-like growth factor (IGF).
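
The sequestering agent loading stated above (0.5-20 wt %, preferably 1-10 wt %, based on total formulation weight) translates directly into a mass per batch. The following is a minimal sketch with hypothetical function names and a hypothetical batch size; it illustrates the weight-percent arithmetic only and is not a formulation procedure.

```python
# Illustrative arithmetic only; the batch size and helper names are hypothetical.
def sequestering_agent_mass_g(total_formulation_g, wt_percent):
    """Mass of sequestering agent (g) at wt_percent of the total formulation weight."""
    if total_formulation_g <= 0:
        raise ValueError("total formulation weight must be positive")
    if not 0.5 <= wt_percent <= 20.0:
        raise ValueError("outside the 0.5-20 wt % range stated in the text")
    return total_formulation_g * wt_percent / 100.0


def in_preferred_range(wt_percent):
    """True if the loading falls in the preferred 1-10 wt % window."""
    return 1.0 <= wt_percent <= 10.0


mass = sequestering_agent_mass_g(10.0, 5.0)  # hypothetical 10 g batch at 5 wt %
print(mass, in_preferred_range(5.0))
```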

The therapeutic compositions are also presently valuable for veterinary applications. 
Particularly domestic animals and thoroughbred horses, in addition to humans, are desired 
patients for such treatment with proteins or other active ingredients of the present invention. The 
dosage regimen of a protein-containing pharmaceutical composition to be used in tissue
regeneration will be determined by the attending physician considering various factors which
modify the action of the proteins, e.g., amount of tissue weight desired to be formed, the site of
damage, the condition of the damaged tissue, the size of a wound, type of damaged tissue (e.g., 
bone), the patient's age, sex, and diet, the severity of any infection, time of administration and 
other clinical factors. The dosage may vary with the type of matrix used in the reconstitution and 
with inclusion of other proteins in the pharmaceutical composition. For example, the addition of 
other known growth factors, such as IGF I (insulin like growth factor I), to the final composition, 
may also affect the dosage. Progress can be monitored by periodic assessment of tissue/bone
growth and/or repair, for example, X-rays, histomorphometric determinations and tetracycline 
labeling. 

Polynucleotides of the present invention can also be used for gene therapy. Such 
polynucleotides can be introduced either in vivo or ex vivo into cells for expression in a 
mammalian subject. Polynucleotides of the invention may also be administered by other known 
methods for introduction of nucleic acid into a cell or organism (including, without limitation, in 
the form of viral vectors or naked DNA). Cells may also be cultured ex vivo in the presence of 
proteins of the present invention in order to proliferate or to produce a desired effect on or 
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes. 

4.12.3 EFFECTIVE DOSAGE 

Pharmaceutical compositions suitable for use in the present invention include 
compositions wherein the active ingredients are contained in an effective amount to achieve their
intended purpose. More specifically, a therapeutically effective amount means an amount 
effective to prevent development of or to alleviate the existing symptoms of the subject being 
treated. Determination of the effective amount is well within the capability of those skilled in 
the art, especially in light of the detailed disclosure provided herein. For any compound used in 
the method of the invention, the therapeutically effective dose can be estimated initially from 
appropriate in vitro assays. For example, a dose can be formulated in animal models to achieve a
circulating concentration range that includes the IC50 as determined in cell culture (i.e., the
concentration of the test compound which achieves a half-maximal inhibition of the protein's
biological activity). Such information can be used to more accurately determine useful doses in
humans.

A therapeutically effective dose refers to that amount of the compound that results in 
amelioration of symptoms or a prolongation of survival in a patient. Toxicity and therapeutic 
efficacy of such compounds can be determined by standard pharmaceutical procedures in cell 
cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the
population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose
ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the
ratio between LD50 and ED50. Compounds which exhibit high therapeutic indices are preferred.
The data obtained from these cell culture assays and animal studies can be used in formulating a
range of dosage for use in humans. The dosage of such compounds lies preferably within a range
of circulating concentrations that include the ED50 with little or no toxicity. The dosage may
vary within this range depending upon the dosage form employed and the route of administration 
utilized. The exact formulation, route of administration and dosage can be chosen by the 
individual physician in view of the patient's condition. See, e.g., Fingl et al., 1975, in "The
Pharmacological Basis of Therapeutics", Ch. 1, p. 1. Dosage amount and interval may be adjusted
individually to provide plasma levels of the active moiety which are sufficient to maintain the
desired effects, or minimal effective concentration (MEC). The MEC will vary for each 
compound but can be estimated from in vitro data. Dosages necessary to achieve the MEC will 
depend on individual characteristics and route of administration. However, HPLC assays or 
bioassays can be used to determine plasma concentrations. 
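
The therapeutic index defined above is a simple ratio of LD50 to ED50. A minimal sketch follows; the numeric values are hypothetical and chosen only to show the calculation, and both doses must be expressed in the same units.

```python
# Illustrative only: the LD50 and ED50 values below are hypothetical, not measured data.
def therapeutic_index(ld50, ed50):
    """Therapeutic index as the ratio LD50 / ED50 (both in the same dose units)."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical compound: LD50 of 200 mg/kg and ED50 of 10 mg/kg.
ti = therapeutic_index(200.0, 10.0)
print(ti)
```

A larger ratio means a wider margin between the effective and the lethal dose, which is why compounds with high therapeutic indices are preferred.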

Dosage intervals can also be determined using the MEC value. Compounds should be
administered using a regimen which maintains plasma levels above the MEC for 10-90% of the
time, preferably between 30-90% and most preferably between 50-90%. In cases of local
administration or selective uptake, the effective local concentration of the drug may not be
related to plasma concentration. 
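
The fraction of time that plasma levels stay at or above the MEC can be estimated from sampled concentrations. The sketch below is one possible approach and not a method described in this specification: it assumes the concentration changes linearly between consecutive samples, and the sample times, concentrations, and MEC value are hypothetical.

```python
# Illustrative only: linear interpolation between samples is an assumption,
# and all numeric values below are hypothetical.
def fraction_at_or_above_mec(times_h, conc, mec):
    """Fraction of the sampled interval with concentration >= mec."""
    if len(times_h) != len(conc) or len(times_h) < 2:
        raise ValueError("need at least two matching time/concentration samples")
    above = 0.0
    for i in range(len(times_h) - 1):
        t0, t1 = times_h[i], times_h[i + 1]
        c0, c1 = conc[i], conc[i + 1]
        dt = t1 - t0
        if dt <= 0:
            raise ValueError("sample times must be strictly increasing")
        if c0 >= mec and c1 >= mec:
            above += dt                      # whole segment at or above the MEC
        elif c0 < mec and c1 < mec:
            continue                         # whole segment below the MEC
        else:
            cross = (mec - c0) / (c1 - c0)   # position of the crossing within the segment
            above += dt * (1.0 - cross) if c1 > c0 else dt * cross
    return above / (times_h[-1] - times_h[0])


times_h = [0.0, 1.0, 2.0, 4.0, 8.0]   # hours after dosing (hypothetical)
conc = [0.0, 8.0, 6.0, 4.0, 0.0]      # plasma concentration, arbitrary units (hypothetical)
frac = fraction_at_or_above_mec(times_h, conc, mec=4.0)
print(f"{frac:.1%} of the interval at or above the MEC")
```

In this hypothetical profile the result falls inside the 10-90% window but below the most preferred 50-90% window, which would suggest a shorter dosing interval or a different regimen.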

An exemplary dosage regimen for polypeptides or other compositions of the invention 
will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight daily, with the preferred
dose being about 0.1 ng/kg to 25 mg/kg of patient body weight daily, varying in adults and 
children. Dosing may be once daily, or equivalent doses may be delivered at longer or shorter 
intervals. 

The amount of composition administered will, of course, be dependent on the subject 
being treated, on the subject's age and weight, the severity of the affliction, the manner of 
administration and the judgment of the prescribing physician. 

4.12.4 PACKAGING 

The compositions may, if desired, be presented in a pack or dispenser device which may 
contain one or more unit dosage forms containing the active ingredient. The pack may, for
example, comprise metal or plastic foil, such as a blister pack. The pack or dispenser device may 
be accompanied by instructions for administration. Compositions comprising a compound of the
invention formulated in a compatible pharmaceutical carrier may also be prepared, placed in an
appropriate container, and labeled for treatment of an indicated condition. 



4.13 ANTIBODIES 

Also included in the invention are antibodies to proteins, or fragments of proteins, of the
invention. The term "antibody" as used herein refers to immunoglobulin molecules and
immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that contain
an antigen binding site that specifically binds (immunoreacts with) an antigen. Such antibodies
include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab, Fab' and
F(ab')2 fragments, and an Fab expression library. In general, an antibody molecule obtained from
humans relates to any of the classes IgG, IgM, IgA, IgE and IgD, which differ from one another
by the nature of the heavy chain present in the molecule. Certain classes have subclasses as well,
such as IgG1, IgG2, and others. Furthermore, in humans, the light chain may be a kappa chain or
a lambda chain. Reference herein to antibodies includes a reference to all such classes,
subclasses and types of human antibody species.

An isolated related protein of the invention, or a portion or fragment thereof, may serve as an
antigen, and additionally can be used as an immunogen to generate
antibodies that immunospecifically bind the antigen, using standard techniques for polyclonal
and monoclonal antibody preparation. The full-length protein can be used or, alternatively, the
invention provides antigenic peptide fragments of the antigen for use as immunogens. An
antigenic peptide fragment comprises at least 6 amino acid residues of the amino acid sequence
of the full length protein, such as an amino acid sequence shown in SEQ ID NO: 1787, and
encompasses an epitope thereof such that an antibody raised against the peptide forms a specific
immune complex with the full length protein or with any fragment that contains the epitope.
Preferably, the antigenic peptide comprises at least 10 amino acid residues, or at least 15 amino
acid residues, or at least 20 amino acid residues, or at least 30 amino acid residues. Preferred
epitopes encompassed by the antigenic peptide are regions of the protein that are located on its
surface; commonly these are hydrophilic regions.

In certain embodiments of the invention, at least one epitope encompassed by the
antigenic peptide is a region of the related protein that is located on the surface of the protein,
e.g., a hydrophilic region. A hydrophobicity analysis of the human related protein sequence will
indicate which regions of a related protein are particularly hydrophilic and, therefore, are likely
to encode surface residues useful for targeting antibody production. As a means for targeting
antibody production, hydropathy plots showing regions of hydrophilicity and hydrophobicity
may be generated by any method well known in the art, including, for example, the
Kyte-Doolittle or the Hopp-Woods methods, either with or without Fourier transformation. See, e.g.,
Hopp and Woods, 1981, Proc. Nat. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle, 1982, J.
Mol. Biol. 157: 105-142, each of which is incorporated herein by reference in its entirety.
Antibodies that are specific for one or more domains within an antigenic protein, or derivatives,
fragments, analogs or homologs thereof, are also provided herein.
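
By way of illustration only, a sliding-window hydropathy profile of the kind referred to above can be sketched as follows. The per-residue values are the published Kyte-Doolittle scale; the function names, the window length of 6 residues and the example sequence are assumptions chosen for this sketch, and the sequence shown is hypothetical rather than a sequence of the invention. No Fourier transformation is applied.

```python
# Kyte-Doolittle hydropathy values (Kyte and Doolittle, 1982); higher = more hydrophobic.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=6):
    """Return the mean hydropathy of every full window along the sequence."""
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    if window < 1 or window > len(scores):
        raise ValueError("window must be between 1 and the sequence length")
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def most_hydrophilic_window(sequence, window=6):
    """Return (start_index, peptide, mean_score) for the lowest-scoring window."""
    profile = hydropathy_profile(sequence, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, sequence[start:start + window], profile[start]


# Hypothetical sequence, for illustration only.
example = "MKTLLVAGIVLLAADKRDEEKSRQGSPVIALLFW"
print(most_hydrophilic_window(example, window=6))
```

Windows with the most negative mean score are the most hydrophilic on this scale and would be candidate surface regions under the reasoning given above; the window length can be matched to the intended peptide length.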

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog 
thereof, may be utilized as an immunogen in the generation of antibodies that 
immunospecifically bind these protein components. 

Various procedures known within the art may be used for the production of polyclonal or 
monoclonal antibodies directed against a protein of the invention, or against derivatives,
fragments, analogs, homologs or orthologs thereof (see, for example, Antibodies: A Laboratory
Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory Press, Cold Spring
Harbor, NY, incorporated herein by reference). Some of these antibodies are discussed below. 

5.13.1 Polyclonal Antibodies 

For the production of polyclonal antibodies, various suitable host animals (e.g., rabbit, 
goat, mouse or other mammal) may be immunized by one or more injections with the native 
protein, a synthetic variant thereof, or a derivative of the foregoing. An appropriate 
immunogenic preparation can contain, for example, the naturally occurring immunogenic 
protein, a chemically synthesized polypeptide representing the immunogenic protein, or a 
recombinantly expressed immunogenic protein. Furthermore, the protein may be conjugated to
a second protein known to be immunogenic in the mammal being immunized. Examples of such 
immunogenic proteins include but are not limited to keyhole limpet hemocyanin, serum albumin, 
bovine thyroglobulin, and soybean trypsin inhibitor. The preparation can further include an 
adjuvant. Various adjuvants used to increase the immunological response include, but are not 
limited to, Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide), surface 
active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, 
dinitrophenol, etc.), adjuvants usable in humans such as Bacille Calmette-Guerin and 
Corynebacterium parvum, or similar immunostimulatory agents. Additional examples of 
adjuvants which can be employed include MPL-TDM adjuvant (monophosphoryl Lipid A, 
synthetic trehalose dicorynomycolate). 

The polyclonal antibody molecules directed against the immunogenic protein can be 
isolated from the mammal (e.g., from the blood) and further purified by well known techniques, 
such as affinity chromatography using protein A or protein G, which provide primarily the IgG 
fraction of immune serum. Subsequently, or alternatively, the specific antigen which is the
target of the immunoglobulin sought, or an epitope thereof, may be immobilized on a column to 
purify the immune specific antibody by immunoaffinity chromatography. Purification of 
immunoglobulins is discussed, for example, by D. Wilkinson (The Scientist, published by The 
Scientist, Inc., Philadelphia PA, Vol. 14, No. 8 (April 17, 2000), pp. 25-28). 

5.13.2 Monoclonal Antibodies 

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as used 
herein, refers to a population of antibody molecules that contain only one molecular species of 
antibody molecule consisting of a unique light chain gene product and a unique heavy chain 
gene product. In particular, the complementarity determining regions (CDRs) of the monoclonal 
antibody are identical in all the molecules of the population. MAbs thus contain an antigen 
binding site capable of immunoreacting with a particular epitope of the antigen characterized by 
a unique binding affinity for it.

Monoclonal antibodies can be prepared using hybridoma methods, such as those 
described by Kohler and Milstein, Nature, 256:495 (1975). In a hybridoma method, a mouse, 
hamster, or other appropriate host animal, is typically immunized with an immunizing agent to 
elicit lymphocytes that produce or are capable of producing antibodies that will specifically bind 
to the immunizing agent. Alternatively, the lymphocytes can be immunized in vitro. 
The immunizing agent will typically include the protein antigen, a fragment thereof or a fusion 
protein thereof. Generally, either peripheral blood lymphocytes are used if cells of human origin 
are desired, or spleen cells or lymph node cells are used if non-human mammalian sources are 
desired. The lymphocytes are then fused with an immortalized cell line using a suitable fusing 
agent, such as polyethylene glycol, to form a hybridoma cell (Goding, Monoclonal Antibodies: 
Principles and Practice, Academic Press, (1986) pp. 59-103). Immortalized cell lines are usually
transformed mammalian cells, particularly myeloma cells of rodent, bovine and human origin. 
Usually, rat or mouse myeloma cell lines are employed. The hybridoma cells can be cultured in 
a suitable culture medium that preferably contains one or more substances that inhibit the growth 
or survival of the unfused, immortalized cells. For example, if the parental cells lack the enzyme 
hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the culture medium for 
the hybridomas typically will include hypoxanthine, aminopterin, and thymidine ("HAT 
medium"), which substances prevent the growth of HGPRT-deficient cells. 

Preferred immortalized cell lines are those that fuse efficiently, support stable high level
expression of antibody by the selected antibody-producing cells, and are sensitive to a medium 
such as HAT medium. More preferred immortalized cell lines are murine myeloma lines, which 
can be obtained, for instance, from the Salk Institute Cell Distribution Center, San Diego,
California and the American Type Culture Collection, Manassas, Virginia. Human myeloma and
mouse-human heteromyeloma cell lines also have been described for the production of human
monoclonal antibodies (Kozbor, J. Immunol., 133:3001 (1984); Brodeur et al., Monoclonal
Antibody Production Techniques and Applications, Marcel Dekker, Inc., New York, (1987) pp.
51-63).

The culture medium in which the hybridoma cells are cultured can then be assayed for 
the presence of monoclonal antibodies directed against the antigen. Preferably, the binding 
specificity of monoclonal antibodies produced by the hybridoma cells is determined by 
immunoprecipitation or by an in vitro binding assay, such as radioimmunoassay (RIA) or 
enzyme-linked immunoabsorbent assay (ELISA). Such techniques and assays are known in the
art. The binding affinity of the monoclonal antibody can, for example, be determined by the
Scatchard analysis of Munson and Pollard, Anal. Biochem., 107:220 (1980). Preferably,
antibodies having a high degree of specificity and a high binding affinity for the target antigen 
are isolated. 
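
For orientation, the classical linear Scatchard transformation can be sketched as below. It plots bound/free ligand against bound ligand, so that the slope is -1/Kd and the x-intercept is Bmax for a single class of binding sites. This is a simplified illustration under that single-site assumption and is not the procedure of Munson and Pollard itself; the function name and the synthetic data are hypothetical.

```python
def scatchard_fit(bound, free):
    """Least-squares line through (B, B/F); return (Kd, Bmax) for single-site binding."""
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired bound/free measurements")
    x = list(bound)
    y = [b / f for b, f in zip(bound, free)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                     # equals -1/Kd for an ideal single site
    intercept = mean_y - slope * mean_x   # equals Bmax/Kd
    kd = -1.0 / slope
    bmax = -intercept / slope             # x-intercept of the fitted line
    return kd, bmax


# Synthetic, noise-free data generated from assumed Kd = 2.0 and Bmax = 10.0.
free_conc = [0.5, 1.0, 2.0, 4.0, 8.0]
bound_conc = [10.0 * f / (2.0 + f) for f in free_conc]
print(scatchard_fit(bound_conc, free_conc))
```

A smaller fitted Kd corresponds to a higher binding affinity, which is the property selected for above.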

After the desired hybridoma cells are identified, the clones can be subcloned by limiting
dilution procedures and grown by standard methods. Suitable culture media for this purpose
include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640 medium.
Alternatively, the hybridoma cells can be grown in vivo as ascites in a mammal.

The monoclonal antibodies secreted by the subclones can be isolated or purified from the culture
medium or ascites fluid by conventional immunoglobulin purification procedures such as, for
example, protein A-Sepharose, hydroxylapatite chromatography, gel electrophoresis, dialysis, or
affinity chromatography.

The monoclonal antibodies can also be made by recombinant DNA methods, such as
those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies of the
invention can be readily isolated and sequenced using conventional procedures (e.g., by using
oligonucleotide probes that are capable of binding specifically to genes encoding the heavy and
light chains of murine antibodies). The hybridoma cells of the invention serve as a preferred
source of such DNA. Once isolated, the DNA can be placed into expression vectors, which are
then transfected into host cells such as simian COS cells, Chinese hamster ovary (CHO) cells, or
myeloma cells that do not otherwise produce immunoglobulin protein, to obtain the synthesis of
monoclonal antibodies in the recombinant host cells. The DNA also can be modified, for
example, by substituting the coding sequence for human heavy and light chain constant domains
in place of the homologous murine sequences (U.S. Patent No. 4,816,567; Morrison, Nature 368,
812-13 (1994)) or by covalently joining to the immunoglobulin coding sequence all or part of the
coding sequence for a non-immunoglobulin polypeptide. Such a non-immunoglobulin
polypeptide can be substituted for the constant domains of an antibody of the invention, or can
be substituted for the variable domains of one antigen-combining site of an antibody of the
invention to create a chimeric bivalent antibody.

5.13.2 Humanized Antibodies 

The antibodies directed against the protein antigens of the invention can further comprise 
humanized antibodies or human antibodies. These antibodies are suitable for administration to 
humans without engendering an immune response by the human against the administered 
immunoglobulin. Humanized forms of antibodies are chimeric immunoglobulins, 
immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab')2 or other
antigen-binding subsequences of antibodies) that are principally comprised of the sequence of a human
immunoglobulin, and contain minimal sequence derived from a non-human immunoglobulin. 
Humanization can be performed following the method of Winter and co-workers (Jones et al., 
Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-327 (1988); Verhoeyen et al.,
Science, 239:1534-1536 (1988)), by substituting rodent CDRs or CDR sequences for the 
corresponding sequences of a human antibody. (See also U.S. Patent No. 5,225,539.) In some 
instances, Fv framework residues of the human immunoglobulin are replaced by corresponding 
non-human residues. Humanized antibodies can also comprise residues which are found neither 
in the recipient antibody nor in the imported CDR or framework sequences. In general, the 
humanized antibody will comprise substantially all of at least one, and typically two, variable 
domains, in which all or substantially all of the CDR regions correspond to those of a non-human 
immunoglobulin and all or substantially all of the framework regions are those of a human 
immunoglobulin consensus sequence. The humanized antibody optimally also will comprise at 
least a portion of an immunoglobulin constant region (Fc), typically that of a human 
immunoglobulin (Jones et al., 1986; Riechmann et al., 1988; and Presta, Curr. Op. Struct. Biol.,
2:593-596 (1992)). 

5.13.3 Human Antibodies

Fully human antibodies relate to antibody molecules in which essentially the entire 
sequences of both the light chain and the heavy chain, including the CDRs, arise from human 
genes. Such antibodies are termed "human antibodies", or "fully human antibodies" herein. 
Human monoclonal antibodies can be prepared by the trioma technique; the human B-cell 
hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV hybridoma 
technique to produce human monoclonal antibodies (see Cole, et al., 1985 In: Monoclonal
Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Human monoclonal
antibodies may be utilized in the practice of the present invention and may be produced by using
human hybridomas (see Cote, et al., 1983. Proc Natl Acad Sci USA 80: 2026-2030) or by 
transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et al., 1985 In: 
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). 

In addition, human antibodies can also be produced using additional techniques, 
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227:381 (1991); 
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be made by
introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the 
endogenous immunoglobulin genes have been partially or completely inactivated. Upon 
challenge, human antibody production is observed, which closely resembles that seen in humans 
in all respects, including gene rearrangement, assembly, and antibody repertoire. This approach 
is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126; 
5,633,425; 5,661,016, and in Marks et al. (Bio/Technology 10, 779-783 (1992)); Lonberg et al.
(Nature 368, 856-859 (1994)); Morrison (Nature 368, 812-13 (1994)); Fishwild et al. (Nature
Biotechnology 14, 845-51 (1996)); Neuberger (Nature Biotechnology 14, 826 (1996)); and
Lonberg and Huszar (Intern. Rev. Immunol. 13, 65-93 (1995)).

Human antibodies may additionally be produced using transgenic nonhuman animals 
which are modified so as to produce fully human antibodies rather than the animal's endogenous 
antibodies in response to challenge by an antigen. (See PCT publication WO94/02602). The 
endogenous genes encoding the heavy and light immunoglobulin chains in the nonhuman host 
have been incapacitated, and active loci encoding human heavy and light chain immunoglobulins 
are inserted into the host's genome. The human genes are incorporated, for example, using yeast 
artificial chromosomes containing the requisite human DNA segments. An animal which 
provides all the desired modifications is then obtained as progeny by crossbreeding intermediate 
transgenic animals containing fewer than the full complement of the modifications. The 
preferred embodiment of such a nonhuman animal is a mouse, and is termed the Xenomouse™ 
as disclosed in PCT publications WO 96/33735 and WO 96/34096. This animal produces B cells 
which secrete fully human immunoglobulins. The antibodies can be obtained directly from the 
animal after immunization with an immunogen of interest, as, for example, a preparation of a 
polyclonal antibody, or alternatively from immortalized B cells derived from the animal, such as 
hybridomas producing monoclonal antibodies. Additionally, the genes encoding the 
immunoglobulins with human variable regions can be recovered and expressed to obtain the 
antibodies directly, or can be further modified to obtain analogs of antibodies such as, for 
example, single chain Fv molecules.

An example of a method of producing a nonhuman host, exemplified as a mouse, lacking
expression of an endogenous immunoglobulin heavy chain is disclosed in U.S. Patent No.
5,939,598. It can be obtained by a method including deleting the J segment genes from at least
one endogenous heavy chain locus in an embryonic stem cell to prevent rearrangement of the
locus and to prevent formation of a transcript of a rearranged immunoglobulin heavy chain locus,
the deletion being effected by a targeting vector containing a gene encoding a selectable marker;
and producing from the embryonic stem cell a transgenic mouse whose somatic and germ cells
contain the gene encoding the selectable marker.

A method for producing an antibody of interest, such as a human antibody, is disclosed in
U.S. Patent No. 5,916,771. It includes introducing an expression vector that contains a
nucleotide sequence encoding a heavy chain into one mammalian host cell in culture, introducing
an expression vector containing a nucleotide sequence encoding a light chain into another
mammalian host cell, and fusing the two cells to form a hybrid cell. The hybrid cell expresses an
antibody containing the heavy chain and the light chain.

In a further improvement on this procedure, a method for identifying a clinically relevant
epitope on an immunogen, and a correlative method for selecting an antibody that binds
immunospecifically to the relevant epitope with high affinity, are disclosed in PCT publication
WO 99/53049.

5.13.4 Fab Fragments and Single Chain Antibodies

According to the invention, techniques can be adapted for the production of single-chain
antibodies specific to an antigenic protein of the invention (see e.g., U.S. Patent No. 4,946,778).
In addition, methods can be adapted for the construction of Fab expression libraries (see e.g.,
Huse, et al., 1989 Science 246: 1275-1281) to allow rapid and effective identification of
monoclonal Fab fragments with the desired specificity for a protein or derivatives, fragments,
analogs or homologs thereof. Antibody fragments that contain the idiotypes to a protein antigen
may be produced by techniques known in the art including, but not limited to: (i) an F(ab')2
fragment produced by pepsin digestion of an antibody molecule; (ii) an Fab fragment generated
by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an Fab fragment generated by the
treatment of the antibody molecule with papain and a reducing agent and (iv) Fv fragments.

5.13.5 Bispecific Antibodies 

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that 
have binding specificities for at least two different antigens. In the present case, one of the
binding specificities is for an antigenic protein of the invention. The second binding target is any 
other antigen, and advantageously is a cell-surface protein or receptor or receptor subunit.

Methods for making bispecific antibodies are known in the art. Traditionally, the
recombinant production of bispecific antibodies is based on the co-expression of two
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have different
specificities (Milstein and Cuello, Nature, 305:537-539 (1983)). Because of the random
assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas) produce a
potential mixture of ten different antibody molecules, of which only one has the correct
bispecific structure. The purification of the correct molecule is usually accomplished by affinity
chromatography steps. Similar procedures are disclosed in WO 93/08829, published 13 May
1993, and in Traunecker et al., 1991 EMBO J., 10:3655-3659.
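
The figure of ten possible molecules follows from simple counting, which the short sketch below reproduces. It assumes two heavy chains and two light chains that pair at random, treats each antibody arm as one heavy chain plus one light chain, and treats a molecule as an unordered pair of arms; the chain labels are hypothetical.

```python
from itertools import combinations_with_replacement, product

heavy_chains = ["H1", "H2"]
light_chains = ["L1", "L2"]

# Each antibody arm is one heavy chain paired with one light chain (4 possible arms).
arms = [h + l for h, l in product(heavy_chains, light_chains)]

# Each assembled molecule is an unordered pair of arms (repeats allowed).
molecules = list(combinations_with_replacement(arms, 2))

# The desired bispecific molecule pairs each heavy chain with its cognate light chain.
correct = [m for m in molecules if set(m) == {"H1L1", "H2L2"}]

print(len(molecules), correct)
```

Only one of the enumerated combinations carries both cognate heavy/light pairings, which is why purification of the correct molecule is needed.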

Antibody variable domains with the desired binding specificities (antibody-antigen 
combining sites) can be fused to immunoglobulin constant domain sequences. The fusion 
preferably is with an immunoglobulin heavy-chain constant domain, comprising at least part of
the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain constant region
(CH1) containing the site necessary for light-chain binding present in at least one of the fusions.
DNAs encoding the immunoglobulin heavy-chain fusions and, if desired, the immunoglobulin
light chain, are inserted into separate expression vectors, and are co-transfected into a suitable
host organism. For further details of generating bispecific antibodies see, for example, Suresh et
al., Methods in Enzymology, 121:210 (1986).

According to another approach described in WO 96/27011, the interface between a pair
of antibody molecules can be engineered to maximize the percentage of heterodimers which are
recovered from recombinant cell culture. The preferred interface comprises at least a part of the
CH3 region of an antibody constant domain. In this method, one or more small amino acid side
chains from the interface of the first antibody molecule are replaced with larger side chains (e.g.
tyrosine or tryptophan). Compensatory "cavities" of identical or similar size to the large side 
chain(s) are created on the interface of the second antibody molecule by replacing large amino 
acid side chains with smaller ones (e.g. alanine or threonine). This provides a mechanism for 
increasing the yield of the heterodimer over other unwanted end-products such as homodimers. 

Bispecific antibodies can be prepared as full length antibodies or antibody fragments (e.g.
F(ab')2 bispecific antibodies). Techniques for generating bispecific antibodies from antibody
fragments have been described in the literature. For example, bispecific antibodies can be
prepared using chemical linkage. Brennan et al., Science 229:81 (1985) describe a procedure
wherein intact antibodies are proteolytically cleaved to generate F(ab')2 fragments. These
fragments are reduced in the presence of the dithiol complexing agent sodium arsenite to
stabilize vicinal dithiols and prevent intermolecular disulfide formation. The Fab' fragments
generated are then converted to thionitrobenzoate (TNB) derivatives. One of the Fab'-TNB
derivatives is then reconverted to the Fab'-thiol by reduction with mercaptoethylamine and is
mixed with an equimolar amount of the other Fab'-TNB derivative to form the bispecific
antibody. The bispecific antibodies produced can be used as agents for the selective
immobilization of enzymes.

Additionally, Fab' fragments can be directly recovered from E. coli and chemically
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175:217-225 (1992) describe
the production of a fully humanized bispecific antibody F(ab')2 molecule. Each Fab' fragment
was separately secreted from E. coli and subjected to directed chemical coupling in vitro to form
the bispecific antibody. The bispecific antibody thus formed was able to bind to cells
overexpressing the ErbB2 receptor and normal human T cells, as well as trigger the lytic activity
of human cytotoxic lymphocytes against human breast tumor targets.

Various techniques for making and isolating bispecific antibody fragments directly from
recombinant cell culture have also been described. For example, bispecific antibodies have been
produced using leucine zippers. Kostelny et al., J. Immunol. 148(5):1547-1553 (1992). The
leucine zipper peptides from the Fos and Jun proteins were linked to the Fab' portions of two
different antibodies by gene fusion. The antibody homodimers were reduced at the hinge region
to form monomers and then re-oxidized to form the antibody heterodimers. This method can
also be utilized for the production of antibody homodimers. The "diabody" technology
described by Hollinger et al., Proc. Natl. Acad. Sci. USA 90:6444-6448 (1993) has provided an
alternative mechanism for making bispecific antibody fragments. The fragments comprise a
heavy-chain variable domain (VH) connected to a light-chain variable domain (VL) by a linker
which is too short to allow pairing between the two domains on the same chain. Accordingly,
the VH and VL domains of one fragment are forced to pair with the complementary VL and VH
domains of another fragment, thereby forming two antigen-binding sites. Another strategy for
making bispecific antibody fragments by the use of single-chain Fv (sFv) dimers has also been
reported. See, Gruber et al., J. Immunol. 152:5368 (1994).

Antibodies with more than two valencies are contemplated. For example, trispecific
antibodies can be prepared. Tutt et al., J. Immunol. 147:60 (1991).

Exemplary bispecific antibodies can bind to two different epitopes, at least one of which
originates in the protein antigen of the invention. Alternatively, an anti-antigenic arm of an
immunoglobulin molecule can be combined with an arm which binds to a triggering molecule on
a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3, CD28, or B7), or Fc receptors for
IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and FcγRIII (CD16) so as to focus cellular
defense mechanisms to the cell expressing the particular antigen. Bispecific antibodies can also
be used to direct cytotoxic agents to cells which express a particular antigen. These antibodies
possess an antigen-binding arm and an arm which binds a cytotoxic agent or a radionuclide
chelator, such as EOTUBE, DTPA, DOTA, or TETA. Another bispecific antibody of interest
binds the protein antigen described herein and further binds tissue factor (TF).



5.13.6 Heteroconjugate Antibodies 

Heteroconjugate antibodies are also within the scope of the present invention.
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such antibodies
have, for example, been proposed to target immune system cells to unwanted cells (U.S. Patent
No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO 92/200373; EP 03089).
It is contemplated that the antibodies can be prepared in vitro using known methods in synthetic
protein chemistry, including those involving crosslinking agents. For example, immunotoxins
can be constructed using a disulfide exchange reaction or by forming a thioether bond.
Examples of suitable reagents for this purpose include iminothiolate and
methyl-4-mercaptobutyrimidate and those disclosed, for example, in U.S. Patent No. 4,676,980.



5.13.7 Effector Function Engineering

It can be desirable to modify the antibody of the invention with respect to effector function, so as
to enhance, e.g., the effectiveness of the antibody in treating cancer. For example, cysteine
residue(s) can be introduced into the Fc region, thereby allowing interchain disulfide bond
formation in this region. The homodimeric antibody thus generated can have improved
internalization capability and/or increased complement-mediated cell killing and
antibody-dependent cellular cytotoxicity (ADCC). See Caron et al., J. Exp. Med., 176: 1191-1195 (1992)
and Shopes, J. Immunol., 148: 2918-2922 (1992). Homodimeric antibodies with enhanced
anti-tumor activity can also be prepared using heterobifunctional cross-linkers as described in Wolff
et al., Cancer Research, 53: 2560-2565 (1993). Alternatively, an antibody can be engineered that
has dual Fc regions and can thereby have enhanced complement lysis and ADCC capabilities.
See Stevenson et al., Anti-Cancer Drug Design, 3: 219-230 (1989).

5.13.8 Immunoconjugates

The invention also pertains to immunoconjugates comprising an antibody conjugated to a 
cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an enzymatically active toxin of 
bacterial, fungal, plant, or animal origin, or fragments thereof), or a radioactive isotope (i.e., a
radioconjugate).

Chemotherapeutic agents useful in the generation of such immunoconjugates have been
described above. Enzymatically active toxins and fragments thereof that can be used include
diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A chain (from
Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain, alpha-sarcin,
Aleurites fordii proteins, dianthin proteins, Phytolacca americana proteins (PAPI, PAPII, and
PAP-S), momordica charantia inhibitor, curcin, crotin, saponaria officinalis inhibitor, gelonin,
mitogellin, restrictocin, phenomycin, enomycin, and the tricothecenes. A variety of
radionuclides are available for the production of radioconjugated antibodies. Examples include
212Bi, 131I, 131In, 90Y, and 186Re.
Conjugates of the antibody and cytotoxic agent are made using a variety of bifunctional
protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol) propionate (SPDP),
iminothiolane (IT), bifunctional derivatives of imidoesters (such as dimethyl adipimidate HCl),
active esters (such as disuccinimidyl suberate), aldehydes (such as glutaraldehyde), bis-azido
compounds (such as bis (p-azidobenzoyl) hexanediamine), bis-diazonium derivatives (such as
bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates (such as toluene 2,6-diisocyanate),
and bis-active fluorine compounds (such as 1,5-difluoro-2,4-dinitrobenzene). For example, a
ricin immunotoxin can be prepared as described in Vitetta et al., Science, 238: 1098 (1987).
Carbon-14-labeled 1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid
(MX-DTPA) is an exemplary chelating agent for conjugation of radionuclide to the antibody. See
WO94/11026.

In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate is
administered to the patient, followed by removal of unbound conjugate from the circulation
using a clearing agent and then administration of a "ligand" (e.g., avidin) that is in turn
conjugated to a cytotoxic agent.

4.14 COMPUTER READABLE SEQUENCES 

In one application of this embodiment, a nucleotide sequence of the present invention can
be recorded on computer readable media. As used herein, "computer readable media" refers to
any medium which can be read and accessed directly by a computer. Such media include, but
are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and
magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM
and ROM; and hybrids of these categories such as magnetic/optical storage media. A skilled
artisan can readily appreciate how any of the presently known computer readable media can
be used to create a manufacture comprising computer readable medium having recorded thereon
a nucleotide sequence of the present invention. As used herein, "recorded" refers to a process for
storing information on computer readable medium. A skilled artisan can readily adopt any of the
presently known methods for recording information on computer readable medium to generate
manufactures comprising the nucleotide sequence information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a
computer readable medium having recorded thereon a nucleotide sequence of the present
invention. The choice of the data storage structure will generally be based on the means chosen
to access the stored information. In addition, a variety of data processor programs and formats
can be used to store the nucleotide sequence information of the present invention on computer
readable medium. The sequence information can be represented in a word processing text file,
formatted in commercially-available software such as WordPerfect and Microsoft Word, or
represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase,
Oracle, or the like. A skilled artisan can readily adapt any number of data processor structuring
formats (e.g. text file or database) in order to obtain computer readable medium having recorded
thereon the nucleotide sequence information of the present invention.
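
By way of illustration only, the two storage options named above (an ASCII text file and a database application) can be sketched in Python as follows. The FASTA-style file layout, the SQLite database, the table and column names, and the example sequence are all assumptions made for this sketch; the specification does not prescribe any of them.

```python
import sqlite3
import tempfile
from pathlib import Path


def write_fasta(path, seq_id, sequence, width=60):
    """Record one sequence as plain ASCII text in a FASTA-style layout."""
    lines = [f">{seq_id}"]
    lines += [sequence[i:i + width] for i in range(0, len(sequence), width)]
    Path(path).write_text("\n".join(lines) + "\n", encoding="ascii")


def read_fasta(path):
    """Read back a single-record file written by write_fasta."""
    lines = Path(path).read_text(encoding="ascii").splitlines()
    return lines[0][1:], "".join(lines[1:])


def store_in_database(conn, seq_id, sequence):
    """Record one sequence in a relational table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences "
        "(seq_id TEXT PRIMARY KEY, residues TEXT NOT NULL)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO sequences VALUES (?, ?)", (seq_id, sequence)
    )
    conn.commit()


def fetch_from_database(conn, seq_id):
    """Return the stored residues for seq_id, or None if absent."""
    row = conn.execute(
        "SELECT residues FROM sequences WHERE seq_id = ?", (seq_id,)
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    example = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"  # made-up sequence
    with tempfile.TemporaryDirectory() as tmp:
        fasta_path = Path(tmp) / "example.fasta"
        write_fasta(fasta_path, "EXAMPLE_1", example)
        print(read_fasta(fasta_path))
    conn = sqlite3.connect(":memory:")
    store_in_database(conn, "EXAMPLE_1", example)
    print(fetch_from_database(conn, "EXAMPLE_1"))
```

The flat file suits sequential reading by search programs, whereas the database form suits keyed retrieval of individual records, which reflects the point above that the choice of storage structure follows from the means of access.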

By providing any of the nucleotide sequences SEQ ID NO: 1-1786 and 3573-5358 or a
representative fragment thereof, or a nucleotide sequence at least 95% identical to any of the
nucleotide sequences of SEQ ID NO: 1-1786 and 3573-5358, in computer readable form, a skilled
artisan can routinely access the sequence information for a variety of purposes. Computer
software is publicly available which allows a skilled artisan to access sequence information
provided in a computer readable medium. The examples which follow demonstrate how
software which implements the BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990)) and
BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993)) search algorithms on a Sybase system
is used to identify open reading frames (ORFs) within a nucleic acid sequence. Such ORFs may
be protein encoding fragments and may be useful in producing commercially important proteins
such as enzymes used in fermentation reactions and in the production of commercially useful
metabolites.
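
As a simplified illustration of ORF identification, the following Python sketch scans all six reading frames for a start codon followed in frame by a stop codon. It is not the BLAST or BLAZE software referred to above; the function names, the minimum-length parameter and the use of ATG as the only start codon are assumptions made for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """Find ATG...stop ORFs in all six reading frames.

    Returns (strand, start, end, orf) tuples. Coordinates are 0-based on
    the scanned strand, end is exclusive, and the ORF includes its stop
    codon. min_codons counts the start and stop codons.
    """
    seq = seq.upper()
    results = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens an ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        results.append((strand, start, i + 3, s[start:i + 3]))
                    start = None  # resume the search after the stop codon
    return results


if __name__ == "__main__":
    for orf in find_orfs("CCATGAAATAGGG"):
        print(orf)
```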

As used herein, "a computer-based system" refers to the hardware means, software
means, and data storage means used to analyze the nucleotide sequence information of the
present invention. The minimum hardware means of the computer-based systems of the present
invention comprises a central processing unit (CPU), input means, output means, and data
storage means. A skilled artisan can readily appreciate that any one of the currently available
computer-based systems is suitable for use in the present invention. As stated above, the
computer-based systems of the present invention comprise a data storage means having stored
therein a nucleotide sequence of the present invention and the necessary hardware means and
software means for supporting and implementing a search means. As used herein, "data storage
means" refers to memory which can store nucleotide sequence information of the present
invention, or a memory access means which can access manufactures having recorded thereon
the nucleotide sequence information of the present invention.

As used herein, "search means" refers to one or more programs which are implemented
on the computer-based system to compare a target sequence or target structural motif with the
sequence information stored within the data storage means. Search means are used to identify
fragments or regions of a known sequence which match a particular target sequence or target
motif. A variety of known algorithms are disclosed publicly and a variety of commercially
available software for conducting search means are available and can be used in the computer-based
systems of the present invention. Examples of such software include, but are not limited to,
Smith-Waterman, MacPattern (EMBL), BLASTN and BLASTX (NCBI). A
skilled artisan can readily recognize that any one of the available algorithms or implementing 
software packages for conducting homology searches can be adapted for use in the present
computer-based systems. As used herein, a "target sequence" can be any nucleic acid or amino
acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can
readily recognize that the longer a target sequence is, the less likely a target sequence will be
present as a random occurrence in the database. The most preferred sequence length of a target
sequence is from about 10 to 300 amino acids, more preferably from about 30 to 100 nucleotide
residues. However, it is well recognized that searches for commercially important fragments,
such as sequence fragments involved in gene expression and protein processing, may be of
such as sequence fragments involved in gene expression and protein processing, may be of 
shorter length. 
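
The relationship between target length and chance occurrence can be illustrated with a minimal Python sketch. The exact-match search below is a deliberately naive stand-in for the search means described above, and the probability model (independent, equally frequent residues) is an assumption made for the example rather than a property of any real database.

```python
def find_matches(target, database):
    """Return {record_id: [0-based start positions]} for exact matches."""
    target = target.upper()
    hits = {}
    for record_id, seq in database.items():
        seq = seq.upper()
        positions = []
        pos = seq.find(target)
        while pos != -1:
            positions.append(pos)
            pos = seq.find(target, pos + 1)  # allow overlapping matches
        if positions:
            hits[record_id] = positions
    return hits


def expected_random_hits(target_length, database_length, alphabet_size=4):
    """Expected number of chance matches under a uniform residue model.

    Assumes every residue is independent and equally likely, which real
    sequences only approximate.
    """
    windows = max(database_length - target_length + 1, 0)
    return windows * (1.0 / alphabet_size) ** target_length


if __name__ == "__main__":
    db = {"rec1": "ACGACGTT", "rec2": "TTTTTTTT"}  # made-up records
    print(find_matches("ACG", db))
    for length in (6, 12, 30):
        print(length, expected_random_hits(length, 1_000_000))
```

Under this model each additional nucleotide in the target reduces the expected number of chance matches by a factor of four (a factor of twenty for amino acid targets, with alphabet_size=20).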

As used herein, "a target structural motif," or "target motif," refers to any rationally 
selected sequence or combination of sequences in which the sequence(s) are chosen based on a 
three-dimensional configuration which is formed upon the folding of the target motif. There are
a variety of target motifs known in the art. Protein target motifs include, but are not limited to, 
enzyme active sites and signal sequences. Nucleic acid target motifs include, but are not limited 
to, promoter sequences, hairpin structures and inducible expression elements (protein binding 
sequences). 

4.15 TRIPLE HELIX FORMATION 

In addition, the fragments of the present invention, as broadly described, can be used to 
control gene expression through triple helix formation or antisense DNA or RNA, both of which 
methods are based on the binding of a polynucleotide sequence to DNA or RNA. 
Polynucleotides suitable for use in these methods are preferably 20 to 40 bases in length and are
designed to be complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan
et al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca 
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription 
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into 
polypeptide. Both techniques have been demonstrated to be effective in model systems. 
Information contained in the sequences of the present invention is necessary for the design of an 
antisense or triple helix oligonucleotide. 
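
A minimal sketch of how sequence information could be used to derive a candidate antisense oligonucleotide is given below. It simply takes the reverse complement of a chosen mRNA window of 20 to 40 bases; the function name, its parameters and the example mRNA are hypothetical, and no account is taken of secondary structure, specificity or other design considerations.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_oligo(mrna, start, length=20, dna=True):
    """Return an oligonucleotide complementary to mrna[start:start+length].

    The result is written 5' to 3'. With dna=True the oligo is reported
    with T in place of U. Length is restricted to the 20 to 40 base range
    described in the text.
    """
    if not 20 <= length <= 40:
        raise ValueError("length must be between 20 and 40 bases")
    mrna = mrna.upper().replace("T", "U")
    region = mrna[start:start + length]
    if len(region) < length:
        raise ValueError("target window runs past the end of the mRNA")
    oligo = region.translate(COMPLEMENT)[::-1]  # reverse complement
    return oligo.replace("U", "T") if dna else oligo


if __name__ == "__main__":
    example_mrna = "AUGGCCAUUGUAAUGGGCCGCUGA"  # made-up sequence
    print(antisense_oligo(example_mrna, 0, 20))
```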

4.16 DIAGNOSTIC ASSAYS AND KITS 

The present invention further provides methods to identify the presence or expression of 
one of the ORFs of the present invention, or homolog thereof, in a test sample, using a nucleic 
acid probe or antibodies of the present invention, optionally conjugated or otherwise associated 
with a suitable label. 

In general, methods for detecting a polynucleotide of the invention can comprise 
contacting a sample with a compound that binds to and forms a complex with the polynucleotide 
for a period sufficient to form the complex, and detecting the complex, so that if a complex is 
detected, a polynucleotide of the invention is detected in the sample. Such methods can also 
comprise contacting a sample under stringent hybridization conditions with nucleic acid primers 
that anneal to a polynucleotide of the invention under such conditions, and amplifying annealed 
polynucleotides, so that if a polynucleotide is amplified, a polynucleotide of the invention is 
detected in the sample. 

In general, methods for detecting a polypeptide of the invention can comprise contacting 
a sample with a compound that binds to and forms a complex with the polypeptide for a period 
sufficient to form the complex, and detecting the complex, so that if a complex is detected, a 
polypeptide of the invention is detected in the sample. 

In detail, such methods comprise incubating a test sample with one or more of the 
antibodies or one or more of the nucleic acid probes of the present invention and assaying for 
binding of the nucleic acid probes or antibodies to components within the test sample. 

Conditions for incubating a nucleic acid probe or antibody with a test sample vary. 
Incubation conditions depend on the format employed in the assay, the detection methods 
employed, and the type and nature of the nucleic acid probe or antibody used in the assay. One 
skilled in the art will recognize that any one of the commonly available hybridization, 
amplification or immunological assay formats can readily be adapted to employ the nucleic acid 
probes or antibodies of the present invention. Examples of such assays can be found in Chard,
T., An Introduction to Radioimmunoassay and Related Techniques, Elsevier Science Publishers,
Amsterdam, The Netherlands (1986); Bullock, G.R. et al., Techniques in Immunocytochemistry,
Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice
and Theory of Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology,
Elsevier Science Publishers, Amsterdam, The Netherlands (1985). The test samples of the 
present invention include cells, protein or membrane extracts of cells, or biological fluids such as 
sputum, blood, serum, plasma, or urine. The test sample used in the above-described method 
will vary based on the assay format, nature of the detection method and the tissues, cells or 
extracts used as the sample to be assayed. Methods for preparing protein extracts or membrane 
extracts of cells are well known in the art and can readily be adapted in order to obtain a
sample which is compatible with the system utilized. 

In another embodiment of the present invention, kits are provided which contain the 
necessary reagents to carry out the assays of the present invention. Specifically, the invention
provides a compartment kit to receive, in close confinement, one or more containers which
comprise: (a) a first container comprising one of the probes or antibodies of the present
invention; and (b) one or more other containers comprising one or more of the following: wash 
reagents, reagents capable of detecting presence of a bound probe or antibody. 

In detail, a compartment kit includes any kit in which reagents are contained in separate 
containers. Such containers include small glass containers, plastic containers or strips of plastic 
or paper. Such containers allow one to efficiently transfer reagents from one compartment to
another compartment such that the samples and reagents are not cross-contaminated, and the 
agents or solutions of each container can be added in a quantitative fashion from one 
compartment to another. Such containers will include a container which will accept the test 
sample, a container which contains the antibodies used in the assay, containers which contain 
wash reagents (such as phosphate buffered saline, Tris-buffers, etc.), and containers which 
contain the reagents used to detect the bound antibody or probe. Types of detection reagents 
include labeled nucleic acid probes, labeled secondary antibodies, or in the alternative, if the 
primary antibody is labeled, the enzymatic, or antibody binding reagents which are capable of 
reacting with the labeled antibody. One skilled in the art will readily recognize that the disclosed 
probes and antibodies of the present invention can be readily incorporated into one of the 
established kit formats which are well known in the art.

The novel polypeptides and binding partners of the invention are useful in medical
imaging of sites expressing the molecules of the invention (e.g., where the polypeptide of the
invention is involved in the immune response, for imaging sites of inflammation or infection).
See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such methods involve chemical attachment of
a labeling or imaging agent, administration of the labeled polypeptide to a subject in a
pharmaceutically acceptable carrier, and imaging the labeled polypeptide in vivo at the target
site.

4.18 SCREENING ASSAYS 

Using the isolated proteins and polynucleotides of the invention, the present invention 
further provides methods of obtaining and identifying agents which bind to a polypeptide 
encoded by an ORF corresponding to any of the nucleotide sequences set forth in SEQ ID NO: 1-1786
and 3573-5358, or bind to a specific domain of the polypeptide encoded by the nucleic
acid. In detail, said method comprises the steps of: 

(a) contacting an agent with an isolated protein encoded by an ORF of the present 
invention, or nucleic acid of the invention; and 

(b) determining whether the agent binds to said protein or said nucleic acid.

In general, therefore, such methods for identifying compounds that bind to a
polynucleotide of the invention can comprise contacting a compound with a polynucleotide of
the invention for a time sufficient to form a polynucleotide/compound complex, and detecting 
the complex, so that if a polynucleotide/compound complex is detected, a compound that binds 
to a polynucleotide of the invention is identified. 

Likewise, in general, therefore, such methods for identifying compounds that bind to a 
polypeptide of the invention can comprise contacting a compound with a polypeptide of the 
invention for a time sufficient to form a polypeptide/compound complex, and detecting the 
complex, so that if a polypeptide/compound complex is detected, a compound that binds to a
polypeptide of the invention is identified.

Methods for identifying compounds that bind to a polypeptide of the invention can also 
comprise contacting a compound with a polypeptide of the invention in a cell for a time 
sufficient to form a polypeptide/compound complex, wherein the complex drives expression of a
reporter gene sequence in the cell, and detecting the complex by detecting reporter gene
sequence expression, so that if a polypeptide/compound complex is detected, a compound that 
binds a polypeptide of the invention is identified. 

Compounds identified via such methods can include compounds which modulate the 
activity of a polypeptide of the invention (that is, increase or decrease its activity, relative to 
activity observed in the absence of the compound). Alternatively, compounds identified via such
methods can include compounds which modulate the expression of a polynucleotide of the 
invention (that is, increase or decrease expression relative to expression levels observed in the 
absence of the compound). Compounds, such as compounds identified via the methods of the 
invention, can be tested using standard assays well known to those of skill in the art for their 
ability to modulate activity/expression. 

The agents screened in the above assay can be, but are not limited to, peptides, 
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be selected 
and screened at random or rationally selected or designed using protein modeling techniques. 

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and 
the like are selected at random and are assayed for their ability to bind to the protein encoded by 
the ORF of the present invention. Alternatively, agents may be rationally selected or designed. 
As used herein, an agent is said to be "rationally selected or designed" when the agent is chosen 
based on the configuration of the particular protein. For example, one skilled in the art can 
readily adapt currently available procedures to generate peptides, pharmaceutical agents and the 
like, capable of binding to a specific peptide sequence, in order to generate rationally designed 
antipeptide peptides (see, for example, Hruby et al., "Application of Synthetic Peptides: Antisense
Peptides," in Synthetic Peptides, A User's Guide, W.H. Freeman, NY (1992), pp. 289-307, and
Kasprzak et al., Biochemistry 28:9230-8 (1989)), or pharmaceutical agents, or the like.

In addition to the foregoing, one class of agents of the present invention, as broadly 
described, can be used to control gene expression through binding to one of the ORFs or EMFs 
of the present invention. As described above, such agents can be randomly screened or 
rationally designed/selected. Targeting the ORF or EMF allows a skilled artisan to design 
sequence specific or element specific agents, modulating the expression of either a single ORF or
multiple ORFs which rely on the same EMF for expression control. One class of DNA binding
agents are agents which contain base residues which hybridize or form a triple helix
by binding to DNA or RNA. Such agents can be based on the classic phosphodiester,
ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric derivatives which have 
base attachment capacity. 

Agents suitable for use in these methods preferably contain 20 to 40 bases and are 
designed to be complementary to a region of the gene involved in transcription (triple helix - see 
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et
al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into
polypeptide. Both techniques have been demonstrated to be effective in model systems. 
Information contained in the sequences of the present invention is necessary for the design of an 
antisense or triple helix oligonucleotide and other DNA binding agents. 

Agents which bind to a protein encoded by one of the ORFs of the present invention can 
be used as a diagnostic agent. Agents which bind to a protein encoded by one of the ORFs of the 
present invention can be formulated using known techniques to generate a pharmaceutical 
composition. 



4.19 USE OF NUCLEIC ACIDS AS PROBES 

Another aspect of the subject invention is to provide for polypeptide-specific nucleic acid 
hybridization probes capable of hybridizing with naturally occurring nucleotide sequences. The 
hybridization probes of the subject invention may be derived from any of the nucleotide 
sequences SEQ ID NO: 1-1786 and 3573-5358. Because the corresponding gene is only
expressed in a limited number of tissues, a hybridization probe derived from any of the
nucleotide sequences SEQ ID NO: 1-1786 and 3573-5358 can be used as an indicator of the
presence of RNA of a cell type of such a tissue in a sample.

Any suitable hybridization technique can be employed, such as, for example, in situ 
hybridization. PCR as described in US Patents Nos. 4,683,195 and 4,965,188 provides 
additional uses for oligonucleotides based upon the nucleotide sequences. Such probes used in 
PCR may be of recombinant origin, may be chemically synthesized, or a mixture of both. The
probe will comprise a discrete nucleotide sequence for the detection of identical sequences or a 
degenerate pool of possible sequences for identification of closely related genomic sequences. 
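
The notion of a degenerate pool can be illustrated by expanding a probe written with standard IUPAC ambiguity codes into every discrete sequence it represents. The following Python sketch is offered only as an example; the function name and the sample probe are hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(probe):
    """List every discrete sequence represented by a degenerate probe."""
    choices = [IUPAC[base] for base in probe.upper()]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    pool = expand_degenerate("ATGRYN")  # made-up probe
    print(len(pool), pool[:4])
```

The pool size is the product of the number of bases allowed at each position, so degeneracy grows quickly with the number of ambiguous positions.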

Other means for producing specific hybridization probes for nucleic acids include the 
cloning of nucleic acid sequences into vectors for the production of mRNA probes. Such vectors 
are known in the art and are commercially available and may be used to synthesize RNA probes 
in vitro by means of the addition of the appropriate RNA polymerase, such as T7 or SP6 RNA
polymerase and the appropriate radioactively labeled nucleotides. The nucleotide sequences may 
be used to construct hybridization probes for mapping their respective genomic sequences. The 
nucleotide sequence provided herein may be mapped to a chromosome or specific regions of a 
chromosome using well known genetic and/or chromosomal mapping techniques. These 
techniques include in situ hybridization, linkage analysis against known chromosomal markers, 
hybridization screening with libraries or flow-sorted chromosomal preparations specific to 
known chromosomes, and the like. The technique of fluorescent in situ hybridization of 
chromosome spreads has been described, among other places, in Verma et al. (1988) Human
Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York NY.

Fluorescent in situ hybridization of chromosomal preparations and other physical 
chromosome mapping techniques may be correlated with additional genetic map data. Examples
of genetic map data can be found in the 1994 Genome Issue of Science (265:1981f). Correlation
between the location of a nucleic acid on a physical chromosomal map and a specific disease (or 
predisposition to a specific disease) may help delimit the region of DNA associated with that 
genetic disease. The nucleotide sequences of the subject invention may be used to detect 
differences in gene sequences between normal, carrier or affected individuals. 

4.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES

Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared by, for 
example, directly synthesizing the oligonucleotide by chemical means, as is commonly practiced 
using an automated oligonucleotide synthesizer. 

Support bound oligonucleotides may be prepared by any of the methods known to those of
skill in the art using any suitable support such as glass, polystyrene or Teflon. One strategy is to
precisely spot oligonucleotides synthesized by standard synthesizers. Immobilization can be
achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin. Microbiol. 28(6) 1469-72);
using UV light (Nagata et al., 1985; Dahlen et al., 1987; Morrissey & Collins, (1989) Mol. Cell
Probes 3(2) 189-207) or by covalent binding of base modified DNA (Keller et al., 1988; 1989); all
references being specifically incorporated herein.

Another strategy that may be employed is the use of the strong biotin-streptavidin 
interaction as a linker. For example, Broude et al. (1994) Proc. Natl. Acad. Sci. USA 91(8) 3072-6,
describe the use of biotinylated probes, although these are duplex probes, that are immobilized on
streptavidin-coated magnetic beads. Streptavidin-coated beads may be purchased from Dynal, Oslo.
Of course, this same linking chemistry is applicable to coating any surface with streptavidin.
Biotinylated probes may be purchased from various sources, such as, e.g., Operon Technologies 
(Alameda, CA). 

Nunc Laboratories (Naperville, IL) also sells suitable material that could be used. Nunc
Laboratories have developed a method by which DNA can be covalently bound to the microwell
surface termed CovaLink NH. CovaLink NH is a polystyrene surface grafted with secondary amino
groups (>NH) that serve as bridge-heads for further covalent coupling. CovaLink Modules may be
purchased from Nunc Laboratories. DNA molecules may be bound to CovaLink exclusively at the
5'-end by a phosphoramidate bond, allowing immobilization of more than 1 pmol of DNA
(Rasmussen et al., (1991) Anal. Biochem. 198(1) 138-42).

The use of CovaLink NH strips for covalent binding of DNA molecules at the 5'-end has
been described (Rasmussen et al., 1991). In this technology, a phosphoramidate bond is employed
(Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is beneficial as immobilization using
only a single covalent bond is preferred. The phosphoramidate bond joins the DNA to the
CovaLink NH secondary amino groups that are positioned at the end of spacer arms covalently
grafted onto the polystyrene surface through a 2 nm long spacer arm. To link an oligonucleotide to
CovaLink NH via a phosphoramidate bond, the oligonucleotide terminus must have a 5'-end
phosphate group. It is, perhaps, even possible for biotin to be covalently bound to CovaLink and
then streptavidin used to bind the probes.

More specifically, the linkage method includes dissolving DNA in water (7.5 ng/µl) and
denaturing for 10 min. at 95°C and cooling on ice for 10 min. Ice-cold 0.1 M 1-methylimidazole,
pH 7.0 (1-MeIm7), is then added to a final concentration of 10 mM 1-MeIm7. The ssDNA solution is
then dispensed into CovaLink NH strips (75 µl/well) standing on ice.

Carbodiimide, 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC), dissolved in
10 mM 1-MeIm7, is made fresh and 25 µl added per well. The strips are incubated for 5 hours at
50°C. After incubation the strips are washed using, e.g., Nunc-Immuno Wash; first the wells are
washed 3 times, then they are soaked with washing solution for 5 min, and finally they are washed
3 times (wherein the washing solution is 0.4 N NaOH, 0.25% SDS heated to 50°C).
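
The per-well quantities implied by the foregoing linkage procedure can be worked through with a short Python sketch. The calculation assumes that the stated 7.5 ng/µl refers to the DNA solution before the 1-methylimidazole stock is added, and that volumes are simply additive; both are assumptions made for illustration, and the function and parameter names are hypothetical.

```python
def coupling_amounts(
    dna_ng_per_ul=7.5,         # DNA dissolved in water
    meim_stock_mM=100.0,       # 0.1 M 1-methylimidazole stock
    meim_final_mM=10.0,        # target buffer concentration
    dna_mix_ul_per_well=75.0,  # volume of buffered ssDNA dispensed per well
    edc_stock_M=0.2,           # EDC made up in 10 mM 1-methylimidazole
    edc_ul_per_well=25.0,
):
    """Estimate per-well amounts, assuming additive volumes."""
    # Fraction of the buffered DNA mix contributed by the buffer stock.
    stock_fraction = meim_final_mM / meim_stock_mM
    dna_after_buffer = dna_ng_per_ul * (1.0 - stock_fraction)
    dna_ng_per_well = dna_after_buffer * dna_mix_ul_per_well
    final_volume_ul = dna_mix_ul_per_well + edc_ul_per_well
    return {
        "dna_ng_per_well": dna_ng_per_well,
        "final_volume_ul": final_volume_ul,
        "edc_final_M": edc_stock_M * edc_ul_per_well / final_volume_ul,
        "dna_final_ng_per_ul": dna_ng_per_well / final_volume_ul,
    }


if __name__ == "__main__":
    for name, value in coupling_amounts().items():
        print(f"{name}: {value:.4f}")
```

Because both the DNA mix and the EDC solution are in 10 mM 1-methylimidazole, the buffer concentration in the well is unchanged by mixing under these assumptions.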

It is contemplated that a further suitable method for use with the present invention is that 
described in PCT Patent Application WO 90/03382 (Southern & Maskos), incorporated herein by
reference. This method of preparing an oligonucleotide bound to a support involves attaching a
nucleoside 3'-reagent through the phosphate group by a covalent phosphodiester link to aliphatic
hydroxyl groups carried by the support. The oligonucleotide is then synthesized on the supported
nucleoside and protecting groups removed from the synthetic oligonucleotide chain under standard
conditions that do not cleave the oligonucleotide from the support. Suitable reagents include
nucleoside phosphoramidite and nucleoside hydrogen phosphonate.

An on-chip strategy for the preparation of DNA probe arrays may be employed. For example,
addressable laser-activated photodeprotection may be
employed in the chemical synthesis of oligonucleotides directly on a glass surface, as described by
Fodor et al. (1991) Science 251(4995) 767-73, incorporated herein by reference. Probes may also
be immobilized on nylon supports as described by Van Ness et al. (1991) Nucleic Acids Res.
19(12) 3345-50; or linked to Teflon using the method of Duncan & Cavalier (1988) Anal. Biochem.
169(1) 104-8; all references being specifically incorporated herein.

To link an oligonucleotide to a nylon support, as described by Van Ness et al. (1991),
requires activation of the nylon surface via alkylation and selective activation of the 5'-amine of
oligonucleotides with cyanuric chloride.

One particular way to prepare support bound oligonucleotides is to utilize the 
light-generated synthesis described by Pease et al. (1994) PNAS USA 91(11) 5022-6, incorporated
herein by reference. These authors used current photolithographic techniques to generate arrays of
immobilized oligonucleotide probes (DNA chips). These methods, in which light is used to direct
the synthesis of oligonucleotide probes in high-density, miniaturized arrays, utilize photolabile
5'-protected N-acyl-deoxynucleoside phosphoramidites, surface linker chemistry and versatile
combinatorial synthesis strategies. A matrix of 256 spatially defined oligonucleotide probes may be
combinatorial synthesis strategies. A matrix of 256 spatially defined oligonucleotide probes may be 
generated in this manner. 
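
One way in which a set of 256 probes can arise combinatorially is by varying four positions over the four bases (4 x 4 x 4 x 4 = 256). The Python sketch below enumerates such a set and lays it out as a 16 x 16 grid. The choice of four varied positions and the grid layout are assumptions made purely for illustration; the text above does not specify how the 256 probes are composed or arranged.

```python
from itertools import product


def probe_grid(bases="ACGT", varied_positions=4, columns=16):
    """Enumerate all combinations of bases and arrange them in rows.

    Returns a list of rows, each row a list of probe strings. The grid
    is only a bookkeeping device for spatial addresses in this sketch.
    """
    probes = ["".join(p) for p in product(bases, repeat=varied_positions)]
    return [probes[i:i + columns] for i in range(0, len(probes), columns)]


if __name__ == "__main__":
    grid = probe_grid()
    print(len(grid), "rows x", len(grid[0]), "columns")
    print(grid[0][:4])
```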

4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS 

The nucleic acids may be obtained from any appropriate source, such as cDNAs, genomic 
DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC inserts, and RNA,
including mRNA without any amplification steps. For example, Sambrook et al. (1989) describes
three protocols for the isolation of high molecular weight DNA from mammalian cells (p. 
9.14-9.23). 

DNA fragments may be prepared as clones in M13, plasmid or lambda vectors and/or
prepared directly from genomic DNA or cDNA by PCR or other amplification methods. Samples 
may be prepared or dispensed in multiwell plates. About 100-1000 ng of DNA samples may be 
prepared in 2-500 ml of final volume. 

The nucleic acids would then be fragmented by any of the methods known to those of skill
in the art including, for example, using restriction enzymes as described at 9.24-9.28 of Sambrook et
al. (1989), shearing by ultrasound and NaOH treatment.

Low pressure shearing is also appropriate, as described by Schriefer et al. (1990) Nucleic
Acids Res. 18(24) 7455-6, incorporated herein by reference. In this method, DNA samples are
passed through a small French pressure cell at a variety of low to intermediate pressures. A lever
device allows controlled application of low to intermediate pressures to the cell. The results of these
studies indicate that low-pressure shearing is a useful alternative to sonic and enzymatic DNA
fragmentation methods.

One particularly suitable way for fragmenting DNA is contemplated to be that using the two 
base recognition endonuclease, CviJI, described by Fitzgerald et al. (1992) Nucleic Acids Res.
20(14) 3753-62. These authors described an approach for the rapid fragmentation and fractionation
of DNA into particular sizes that they contemplated to be suitable for shotgun cloning and
sequencing. 

The restriction endonuclease CviJI normally cleaves the recognition sequence PuGCPy
between the G and C to leave blunt ends. Atypical reaction conditions, which alter the specificity of
this enzyme (CviJI**), yield a quasi-random distribution of DNA fragments from the small
molecule pUC19 (2688 base pairs). Fitzgerald et al. (1992) quantitatively evaluated the
randomness of this fragmentation strategy, using a CviJI** digest of pUC19 that was size
fractionated by a rapid gel filtration method and directly ligated, without end repair, to a lac Z minus
M13 cloning vector. Sequence analysis of 76 clones showed that CviJI** restricts PyGCPy and
PuGCPu, in addition to PuGCPy sites, and that new sequence data is accumulated at a rate
consistent with random fragmentation.
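The site specificities described above can be illustrated with a minimal sketch. It assumes a linear, single-strand representation of the DNA and treats Pu as A or G and Py as C or T; the function names `cut_positions` and `fragments` are hypothetical and are not taken from Fitzgerald et al.

```python
import re

# Pu = purine (A or G), Py = pyrimidine (C or T).
# Lookahead is used so that overlapping sites are all reported.
NORMAL_SITE = re.compile(r"(?=[AG]GC[CT])")  # PuGCPy
RELAXED_SITE = re.compile(r"(?=[AG]GC[CT]|[CT]GC[CT]|[AG]GC[AG])")  # plus PyGCPy, PuGCPu


def cut_positions(seq, relaxed=False):
    """Return 0-based cut indices; each cut falls between the G and the C."""
    pattern = RELAXED_SITE if relaxed else NORMAL_SITE
    # A site starts at m.start(); the blunt cut is after its second base (G).
    return [m.start() + 2 for m in pattern.finditer(seq.upper())]


def fragments(seq, relaxed=False):
    """Split a linear sequence at every cut position."""
    bounds = [0] + cut_positions(seq, relaxed) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]
```

Calling `fragments(seq, relaxed=True)` on a test sequence gives more and shorter pieces than the default call, which mirrors the quasi-random digestion reported for the relaxed conditions. PyGCPu sites are left uncut in both modes, as the text states only three of the four NGCN classes are restricted.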

As reported in the literature, advantages of this approach compared to sonication and
agarose gel fractionation include: smaller amounts of DNA are required (0.2-0.5 ug instead of 2-5
ug); and fewer steps are involved (no preligation, end repair, chemical extraction, or agarose gel
electrophoresis and elution are needed).

Irrespective of the manner in which the nucleic acid fragments are obtained or prepared, it is 
important to denature the DNA to give single stranded pieces available for hybridization. This is 
achieved by incubating the DNA solution for 2-5 minutes at 80-90°C. The solution is then cooled 
quickly to 2°C to prevent renaturation of the DNA fragments before they are contacted with the 
chip. Phosphate groups must also be removed from genomic DNA by methods known in the art. 

4.22 PREPARATION OF DNA ARRAYS

Arrays may be prepared by spotting DNA samples on a support such as a nylon membrane.
Spotting may be performed by using arrays of metal pins (the positions of which correspond to an
array of wells in a microtiter plate) to repeatedly transfer about 20 nl of a DNA solution to a
nylon membrane. By offset printing, a density of dots higher than the density of the wells is
achieved. One to 25 dots may be accommodated in 1 mm², depending on the type of label used. By
avoiding spotting in some preselected number of rows and columns, separate subsets (subarrays)
may be formed. Samples in one subarray may be the same genomic segment of DNA (or the same
gene) from different individuals, or may be different, overlapped genomic clones. Each of the
subarrays may represent replica spotting of the same samples. In one example, a selected gene
segment may be amplified from 64 patients. For each patient, the amplified gene segment may be in
one 96-well plate (all 96 wells containing the same sample). A plate for each of the 64 patients is
prepared. By using a 96-pin device, all samples may be spotted on one 8 x 12 cm membrane.
Subarrays may contain 64 samples, one from each patient. Where the 96 subarrays are identical, the
dot span may be 1 mm² and there may be a 1 mm space between subarrays.

Another approach is to use membranes or plates (available from NUNC, Naperville, Illinois) 
which may be partitioned by physical spacers e.g. a plastic grid molded over the membrane, the grid 
being similar to the sort of membrane applied to the bottom of multiwell plates, or hydrophobic
strips. A fixed physical spacer is not preferred for imaging by exposure to flat phosphor-storage 
screens or x-ray films. 

The present invention is illustrated in the following examples. Upon consideration of the 
present disclosure, one of skill in the art will appreciate that many other embodiments and variations 
may be made in the scope of the present invention. Accordingly, it is intended that the broader 
aspects of the present invention not be limited to the disclosure of the following examples. The 
present invention is not to be limited in scope by the exemplified embodiments which are intended 
as illustrations of single aspects of the invention, and compositions and methods which are 
functionally equivalent are within the scope of the invention. Indeed, numerous modifications and 
variations in the practice of the invention are expected to occur to those skilled in the art upon 
consideration of the present preferred embodiments. Consequently, the only limitations which 
should be placed upon the scope of the invention are those which appear in the appended claims. 

All references cited within the body of the instant specification are hereby incorporated by 
reference in their entirety. 

5.0 EXAMPLES 

5.1 EXAMPLE 1

Novel Nucleic Acid Sequences Obtained From Various Libraries 
A plurality of novel nucleic acids were obtained from cDNA libraries prepared from various
human tissues and in some cases isolated from a genomic library derived from human chromosome
using standard PCR, SBH sequence signature analysis and Sanger sequencing techniques. The
inserts of the library were amplified with PCR using primers specific for the vector sequences which
flank the inserts. Clones from cDNA libraries were spotted on nylon membrane filters and screened
with oligonucleotide probes (e.g., 7-mers) to obtain signature sequences. The clones were clustered
into groups of similar or identical sequences. Representative clones were selected for sequencing.

In some cases, the 5' sequence of the amplified inserts was then deduced using a typical
Sanger sequencing protocol. PCR products were purified and subjected to fluorescent dye
terminator cycle sequencing. Single pass gel sequencing was done using a 377 Applied Biosystems
(ABI) sequencer to obtain the novel nucleic acid sequences. In some cases RACE (Rapid
Amplification of cDNA Ends) was performed to further extend the sequence in the 5' direction.

5.1.2 EXAMPLE 2
Assemblage of Novel Nucleic Acids 

The contigs or nucleic acids of the present invention, designated as SEQ ID NO: 3573-5358,
were assembled using an EST sequence as a seed. Then a recursive algorithm was used to extend
the seed EST into an extended assemblage, by pulling additional sequences from different databases
(i.e., Hyseq's database containing EST sequences, dbEST version 114, gb pri 114, and UniGene
version 101) that belong to this assemblage. The algorithm terminated when there were no
additional sequences from the above databases that would extend the assemblage. Inclusion of
component sequences into the assemblage was based on a BLASTN hit to the extending assemblage
with BLAST score greater than 300 and percent identity greater than 95%.
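The extension procedure can be sketched as follows. This is an illustration under stated assumptions, not the program actually used: the recursion is written as a loop, and `search` is a hypothetical callable standing in for a BLASTN run of the current assemblage against the databases. Only the two thresholds (score greater than 300, identity greater than 95%) come from the text.

```python
from dataclasses import dataclass

MIN_SCORE = 300        # BLAST score must be greater than this
MIN_IDENTITY = 95.0    # percent identity must be greater than this


@dataclass(frozen=True)
class Hit:
    seq_id: str
    score: float
    percent_identity: float


def passes(hit):
    """Apply the inclusion criteria stated in the text."""
    return hit.score > MIN_SCORE and hit.percent_identity > MIN_IDENTITY


def extend_assemblage(seed_id, search):
    """Grow an assemblage from a seed EST until no new sequence qualifies.

    `search` is a hypothetical callable: given the set of member IDs it
    returns an iterable of Hit objects (for example, parsed BLASTN output).
    """
    members = {seed_id}
    while True:
        new = {h.seq_id for h in search(members) if passes(h)} - members
        if not new:  # nothing extends the assemblage, so terminate
            return members
        members |= new
```

Each pass can only add sequences that are not yet members, so the loop ends once a search round returns no new qualifying hit, which matches the termination condition described above.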

A polypeptide was predicted to be encoded by each of SEQ ID NO: 3573-5358 as set forth
below. The polypeptides were predicted using a software program called FASTY (available from
http://fasta.bioch.virginia.edu), which selects a polypeptide based on a comparison of translated
novel polynucleotide to known polynucleotides (W.R. Pearson, Methods in Enzymology, 183:63-98
(1990), herein incorporated by reference). The predicted polypeptides are shown in Table 7.

5.2.2 EXAMPLE 3 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank. Other computer programs which may
have been used in the editing process were phredPhrap and Consed (University of Washington) and
ed-ready, ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 1-327.

Table 1 shows the various tissue sources of SEQ ID NO: 1-327.

The nearest neighbor results for SEQ ID NO: 1-327 were obtained by a FASTA version 3
search against Genpept release 117, using FASTXY algorithm. FASTXY is an improved
version of FASTA alignment which allows in-codon frame shifts. The nearest neighbor result
showed the closest homologue for SEQ ID NO: 1-327 from Genpept. The translated amino acid
sequences for which the nucleic acid sequence encodes are shown in the Sequence Listing. The
nearest neighbor results for SEQ ID NO: 1-327 are shown in Table 2 below.
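Selecting the closest homologue from a set of search hits amounts to keeping the best-scoring subject for each query. The sketch below assumes the hits are already available as (query, subject, score) tuples; that layout and the name `nearest_neighbors` are illustrative assumptions and do not describe the output format of FASTA or Genpept.

```python
def nearest_neighbors(hits):
    """Map each query ID to its highest-scoring (subject, score) pair.

    `hits` is an iterable of (query_id, subject_id, score) tuples; on a
    tie, the hit seen first is kept.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return best
```

A query with no hits simply does not appear in the returned dictionary, so sequences lacking any homologue can be identified by their absence.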

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielson, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielson et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.
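The two summary values reported in Table 5 can be illustrated with a small sketch. It assumes that per-residue S scores and a predicted cleavage position have already been produced by the prediction program, and it takes the mean over the residues preceding the cleavage site; the function name and input layout are hypothetical and do not reproduce SignalP itself.

```python
def s_score_summary(s_scores, cleavage_index):
    """Return (max S, mean S) from a list of per-residue S scores.

    `cleavage_index` is the 0-based index of the first residue of the
    mature protein, so the signal peptide is s_scores[:cleavage_index].
    """
    if not 0 < cleavage_index <= len(s_scores):
        raise ValueError("cleavage_index must fall inside the sequence")
    signal = s_scores[:cleavage_index]
    return max(s_scores), sum(signal) / len(signal)
```

The maximum is taken over the whole scored window, while the mean is restricted to the putative signal peptide, so a sequence with a single high-scoring residue but a low average can be distinguished from one that scores highly throughout.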

5.3.2 EXAMPLE 4 

Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 117, gb pri 117,
UniGene version 117, Genpept release 117). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 328-1413.

Table 1 shows the various tissue sources of SEQ ID NO: 328-1413.

The nearest neighbor results for SEQ ID NO: 328-1413 were obtained by a BLASTP
version 2.0a19MP-WashU search against Genpept release 118, using BLAST algorithm. The
nearest neighbor result showed the closest homologue for SEQ ID NO: 328-1413 from Genpept.
The translated amino acid sequences for which the nucleic acid sequence encodes are shown in
the Sequence Listing. The nearest neighbor results for SEQ ID NO: 328-1413 are shown in
Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature, 
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielson, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielson et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.



5.3.2 EXAMPLE 5

Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 117, gb pri 117,
UniGene version 117, Genpept release 117). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 1414-1652.






WO 01/53312 PCT/US00/34263 
Table 1 shows the various tissue sources of SEQ ID NO: 1414-1652.

The nearest neighbor results for SEQ ID NO: 1414-1652 were obtained by a BLASTP
version 2.0a19MP-WashU search against Genpept release 118, using BLAST algorithm. The
nearest neighbor result showed the closest homologue for SEQ ID NO: 1414-1652 from
Genpept. The translated amino acid sequences for which the nucleic acid sequence encodes are
shown in the Sequence Listing. The nearest neighbor results for SEQ ID NO: 1414-1652 are
shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielson, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielson et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.4.2 EXAMPLE 6 
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 118, gb pri 118,
UniGene version 118, Genpept release 118). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 1653-1745.

Table 1 shows the various tissue sources of SEQ ID NO: 1653-1745.

The homology for SEQ ID NO: 1653-1745 was obtained by a BLASTP version
2.0a19MP-WashU search against Genpept release 118, using BLAST algorithm. The results showed
homologues for SEQ ID NO: 1653-1745 from Genpept. The translated amino acid sequences for
which the nucleic acid sequence encodes are shown in the Sequence Listing. The homologues
with identifiable functions for SEQ ID NO: 1653-1745 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature, 
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielson, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielson et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.5.2 EXAMPLE 7 
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 119, gb pri 119,
UniGene version 119, Genpept release 119). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 1746-1768.

Table 1 shows the various tissue sources of SEQ ID NO: 1746-1768.

The homology for SEQ ID NO: 1746-1768 was obtained by a BLASTP version
2.0a19MP-WashU search against Genpept release 119, using BLAST algorithm. The results showed
homologues for SEQ ID NO: 1746-1768 from Genpept. The translated amino acid sequences for
which the nucleic acid sequence encodes are shown in the Sequence Listing. The homologues
with identifiable functions for SEQ ID NO: 1746-1768 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were examined
to determine whether they had identifiable signature regions. Table 3 shows the signature region
found in the indicated polypeptide sequences, the description of the signature, the eMatrix
p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process for
identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also disclosed by
Henrik Nielson, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the publication
"Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites"
Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by reference. A maximum
S score and a mean S score, as described in the Nielson et al. reference, were obtained for the
polypeptide sequences. Table 5 shows the position of the signal peptide in each of the polypeptides
and the maximum score and mean score associated with that signal peptide.

5.6.2 EXAMPLE 8

Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 120, gb pri 120,
UniGene version 120, Genpept release 120). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The translated amino acid sequences for which the nucleic acid
sequence encodes are shown in the Sequence Listing. The full-length nucleotide sequences, including splice
variants resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 1769-1786.

Table 1 shows the various tissue sources of SEQ ID NO: 1769-1786. 

The homology for SEQ ID NO: 1769-1786 was obtained by a BLASTP version
2.0a19MP-WashU search against Genpept release 120 and the amino acid version of Geneseq
released on October 26, 2000, using BLAST algorithm. The results showed homologues for
SEQ ID NO: 1769-1786 from Genpept. The homologues with identifiable functions for SEQ ID
NO: 1769-1786 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielson, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielson et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

Table 6 is a correlation table of all of the sequences and the SEQ ID NOS.
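The SEQ ID NOS entries of Table 1 mix single numbers with hyphenated ranges, and a range may wrap across a line break. A minimal sketch of how such an entry could be expanded into individual identifiers is given here; the function name `expand_seq_ids` is hypothetical, and the rejection of descending ranges is an added assumption intended to flag likely transcription errors.

```python
import re


def expand_seq_ids(text):
    """Expand a list such as '9 19-21 50-51' into a list of integers."""
    # Rejoin ranges that were split after the hyphen by a line break.
    text = re.sub(r"-\s+", "-", text)
    ids = []
    for token in text.split():
        if "-" in token:
            start, end = (int(part) for part in token.split("-"))
            if end < start:
                raise ValueError(f"descending range: {token}")
            ids.extend(range(start, end + 1))
        else:
            ids.append(int(token))
    return ids
```

Expanding each library's entry in this way makes it straightforward to invert the table, that is, to look up every library in which a given SEQ ID NO was found.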
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TABLE 1 

Tissue Origin" 



adult brain 



RNA Source 



GIBCO 



adult brain" 



GIBCO 



Hyseq 
Library Name 

abTooi 



ABD003 



SEQ ID NOS: 



9 19-21 50-51 65-66 72 78 80 82 
85 87 107-108 113 116 123 138 
140 150-152 159 169 177 192-193 
202-203 212-214 225-226 235-236 
251 258 268-269 272 280-281 295 
298 301 321 326 331-332 334 356- 
357 362 369 379 382-383 416 423 
443 459-460 473 475 477 488 496 
500 503 519 526 547 574 582 S87 
608-609 613 618 633-634 645-646 
652 657-658 660 669-671 678 687 
695 697 710 715 724 731 775-777 
796 804 811 857-859 862 869 899- 
900 912 919 922 924-929 933 936 
962 979 988-989 996 1001 1004- 
1008 1018 1039 1047 1059 1064 
1067 1070 1078 1082 1107 1113 
1116-1117 1131 1134-1137 1140 
1149 1151 1157 1180 1206 1229 
1234 1241 1243 1258 1272-1273 
1279 1288-1290 1294 1307-1308 
1312 1320 1323 1330 1356 1360- 
1361 1366 1373-1375 1379 1391 
1400 1417 1446 1468 1482 1493- 
1494 1501-1503 1506-1507 1512 
1517 1522-1524 1530-1533 1537 
1S49 1565 1578 1598 1606 1608 
1623 1625 1627 1639 1643 1648- 
1649 1653 1664 1667 1671 1696 
1734 1741 1743-1744 1760-1761 
1771 

3 12-14 18-19 25 30-31 34-36 43- 
45 50-51 56 58 60 65-66 68-69 80 
82 85 87 92 104 107-108 112-113 
115-116 123-124 131-132 135-137 
139 142 146 148-149 152 154 157 
159 163 165 167 169 172 180 192- 
193 196-197 199 203 208 210 212- 
214 223 233 235-237 247 257 259 
261 268-269 272 276 280-281 284- 
288 291-292 295 297 300-301 304 
307 317 320-321 323 327 329-331 
333-334 345-349 356-357 379-381 
393 401 408 414 419 424 426-428 
430 433-436 438-439 443 445 449 
453-454 459-461 468 471-473 476- 
478 483 491 494 496 500 503 507- 
508 516 519-520 525-527 S34 536- 
540 542-543 545 553 555 560 569- 
570 574-576 586-588 593 595 597 
601 606-609 615-620 622-623 625 
628-633 635-636 643 645-649 653 
655-656 660-665 668-670 676 681 
687 701 710 715 717 724-728 735 
743 745-746 750 753 759 765-766 
773 775-778 786 789 796 799-800 
802-803 810-811 815 817 820-821 
832 834-836 840 845-847 851 858- 
861 864 869 874 878 883 897 901- 
902 904-905 908 911-914 916 921- 
922 924-927 929 932-934 936-939 
941-942 945 955-958 963 966-969 
977 979-980 985-986 990 992-993 
997-1001 1005-1007 1012 1017- 
1020 1023-1024 1029-1031 1034 
1036 1039 1050 1059 1063-1066 
jj>78 1081-1Q82 1085-1086 1089 
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Tissue Origin 



adult brain 



RNA Source 



Hyaeq 
Library Name 



SEQ ID NOS; 



1097 

1117 

1134 

1158 

1190 

1217 

1241 

1267 

1289 

1316 

1344 

1374 

1394 

1425 

1456 

1478 

1497 

1522- 

1548- 

1565 

1591 

1611 

1630- 

1645 

1664 

1686 

1711 

1731- 

1747 

1761 



1103 

1119 

1144 

1167 

1193- 

1220 

1243 

1269 

1293- 

1320 

1348 

1377 

1400 

1427 

1458- 

1482- 

1499 

1524 

15S0 

1567 

1593 

1620- 

1632 

1647 

1667 

1690 

1719 

1733 

1749 

1765 



1107 
1121 
-1145 
1170 
■1194 
1226 
1247 
1279 
•1294 
1326 
1351 
1380 
1409 
1437 
1459 
1483 
1506 
1530- 
1552 
1569 
1595 
1621 
1636 
1649 
1669 
1694- 
1722- 
1738 
1753 
1771 



1109 
1124 
1149 
1178 
1200 
-1227 
1252 
1281 
1306 
1333 
1355- 
1386 
1414 
1443 
1468 
1487- 
1508- 
•1533 
1557- 
1571 
1S98- 
1624- 
1640- 
1653- 
1673 
1696 
1723 
1740 
1757- 
1785 



1112 
1127 
1151 
1184 
1202 
1229 
1258 
1284 
-1307 
1338 
1357 
1389 
1422- 
1446 
1470- 
1488 
1511 
1545- 
1559 
1586 
1601 
1626 
1641 
16SS 
1678- 
1701 
1726- 
1743- 
1758 



1116- 
1130 
1157- 
1188 
1215- 
1231 
1263 
1286- 
1312 
1341 
1368 
•1390 
•1423 
1454 
1472 
1493 
1517 
1546 
1563 
1588 
1608 
1628 
1644- 
1657 
1681 
1709 
1727 
1744 
1760- 



Clontech 



ABR001 



9 29 68-69 113 115 146 152 206 
223 245 277 307 320 324 330-331 
344 348 352 362 379 384 393 404 
408 414 441-442 454 469 481 490 
506 517 586 597 631 641 659 691 
715 799 803 833 865 871 875 880 
882 908 920 937 1000 1005-1006 
1027 1036 1041 1043 1075 1107 
1112 1121 1127 1136-1137 1144- 
1147 1231 1238-1239 1280 1293 
1320 1345 1355 1361 1383-1384 
1400 1417 1448 1456 1476 1507 
1570 1572 1609-1610 1614 1620 
1626 1645 1653 1754 1759 1770 
1786 



adult brain 



ClontechT 



ABR006 



adult brain 



Clontech 



ABR008 



5-8 15-16 168 212-213 271 278 
280-281 291-292 300-301 310 314 
321 326 336-338 341 352 357 359- 
360 362 369 374 379 384 393 396- 
397 414 419-420 426-428 430 441- 
442 453 506 616-617 661 689 785 
798 845 1018 1109 1113 1124 1148 
1167 1187 1207 1227 1262 1265 
1285 1312 1317-1319 1324-1327 
1344 1369 1381 1400 1416 1421 
1427 1430-1431 1436 1471 1501 
1557-1559 1586 1588 1651 1653 
1664-1665 1671 1673 1690 1697- 
1698 1700 1711 1717 1719-1720 
1728 1736 1740 1743-1744 1757 
1760-1761 



5-10 13-19 22-23 25 29 33 37-39 *" 
43-45 50-51 54-55 57-53 60-66 
68-70 72 75 77-80 83 85 89-92 94 
99-105 108-110 112-113 116-117 
123 128 133 135-137 139 143 145- 
146 148 152 154-155 157 165 168- 
172 174-175 181-184 188-190 193- 
194 196 198-200 202 204-205 207- 
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Tissue Origin 



RNA Source 



Hyseq 
Library Name 



S2Q ID NOS: 



208 210 214-215 218 221-226 229" 
231-232 234-241 24S-247 251-253 
255 257-259 268-269 271 276-281 
285-286 288 290-292 300-302 304 
307 309-311 313 315 317-318 320- 
322 325-326 328 330-331 333-338 
341 344-347 349 352 354 356-357 
362 369-373 376 379-380 382 384 
387 390-391 393-394 397 399-403 
405-411 414-415 417-420 426-428 
437-438 440-444 453-455 462 464 
467 469-471 476 478 492-484 488- 
491 497 503 506-513 516-517 520 
524-526 528-530 532-534 537-540 
542 544 547-551 553 561 565-567 
572-574 577 581 585 587-588 590- 
591 597 599 601-602 606-610 612 
615-617 619-620 622-623 628-629 
631 633-634 636-641 643 645-647 
651-653 655-664 669-671 673 679 
682 687 689 691-700 702 706 710 
715-717 720-721 725-734 736-739 
742-743 746 750-752 756 758-759 
762-764 766 768 773-778 780-782 
784-785 787-789 794 796 799 802- 
803 805 811 814-815 818 825-826 
834-837 839-840 842-843 856-859 
861-862 865 867-872 874-875 881 
883-884.887 889-892 894-895 897- 
898 901 904 908 910 912 914 917 
919 921-924 926-927 930-932 935- 
941 943 945 949 953-954 958 961- 
963 967 969 971 975 977 981-983 
986 988-990 992 997 999-1002 
1004-1006 1008 1012 1018-1023 
1027 1029-1031 1035-1037 1047- 
1048 1053 1057 1059 1063 1068 
1070 1072-1075 1077 1081-1083 
1085-1093 1095-1096 1108-1112 
1114-1125 1127 1131-1133 1135- 
1138 1142-1145 1148-1158 1160- 
1163 1167 1169 1172 1175 1177 
1180 1183-1188 1191-1195 1199- 
1200 1204 1206 1211 1213-1216 
1222-1223 1226-1227 1229-1231 
1234-1235 1241-1242 1244-1263 
1266 1269-1271 1276-1277 1279- 
1281 1284-1286 1292 1294-1295 
1299 1305-1309 1312 1314 1316-
1319 1322 1324-1327 1330 1332 
1334-1335 1339 1344-1346 1351 
1354-1355 1357-1358 1365-1367 
1369-1370 1373-1374 1376-1379 
1381-1384 1386-1388 1392 1394 
1396-1397 1400 1403-1407 1410 
1414 1419-1420 1423 1432-1433 
1435 1437-1438 1440-1442 1446 
1448 1453-1455 1457 1461 1463- 
1464 1466 1468 1471 1477 1480 
1482-1483 1496 1502-1504 1507- 
1509 1513 1519-1520 1524-1526 
1536 1547 1549-1552 1567 1573- 
1574 1578 1586-1589 1597-1598 
1601-1602 1605 1607-1609 1611- 
1617 1619-1621 1623 1625-1626 
1635-1641 1643-1645 1649 1651 
1653 1656-1658 1664 1669 1671-
1674 1676-1684 1686 1689-1690
1694-1696 1704-1705 1708-1709
1720-1724 1726-1728 1730-1733
1737-1740 1742-1745 1753
1756-1757 1759-1761 1765 1767
1771-1772 1776-1777 1779-1780
1786


adult brain

Clontech

ABR011

24 75 103 186 210 310-311 364-365
508 623 710 937 1002-1003 1059
1204 1609 1731-1732


adult brain

BioChain

ABR012

46 182-164 204-205 300 739 767
1371 1549 1620 1684






adult brain

Invitrogen

ABR013

185 204-205 364-365 393 497 595
687 692-694 833 845 1068 1320
1413 1640








adult brain

Invitrogen

ABR014

187 301 357 364-365 375 454 463
731 859 939 983 1073 1262 1270
1320 1403 1640 1651 1657 1696
1722 1738








adult brain

Invitrogen

ABR015

419 434-435 441-442 763 789 983
1320








adult brain

Invitrogen

ABR016

312 364-365 379 1320 1334-1335
1674 1722 1785








adult brain 


Invitrogen 


ABT004


14-l£ 22-23 25 37-39 43 58 60
70-72 78 86 94 107 113 116
136-137 143 146 152 161 173
182-184 194 196 198 210 218 229
259 267 295 298 309-310 320-321
324 336-338 346-347 349-350
356-357 362 371 379-380 382-383
391 393 396 399 401 408 428 438
459 461 476 482 490 502 507-509
516 526 531 557 562 597 602
607-609 624 652 655 667 669
671-672 687-689 695-696 710 712
715 721 732 739 743 750 753 766
778 780-781 789 803 814 826 830
837 841 857 869 874 894-895 925
937 949 954-956 960-961 963
968-969 988-989 1000 1005-1006
1016-1019 1021 1036-1037 1052
1086 1090 1109 1113 1115
1120-1121 1123-1124 1136-1137
1140 1144-1147 1151 1167 1170
1174 1188 1193-1194 1205 1225
1229 1231 1254 1258 1262 1280
1285 1309 1312 1334-1335 1341
1343-1344 1356-1357 1370
1378-1379 1383-1384 1403-1404
1423 1429 1434 1442 1448
1451-1452 1454 1470-1472 1482
1499 1525 1528-1529 1532 1536
1547 1554 1557-1559 1561-1562
1567 1585 1588 1590 1595
1601-1604 1608 1610-1613 1615
1619 1624 1627 1640 1644 1647
1660 1664 1666 1670 1675 1696
1704 1715 1723 1727 1738
1760-1761 1768 1779 1785-1786








cultured
preadipocytes

Stratagene

ADP001

5-8 11 17 25 68-69 80 82 87 103
105 110 116 136-138 168 171
188-189 196-198 261 267 276 288
293 301 318 331 336-338 379-380
391 400 428 430-431 510-512 520
524 527 549 557 561 602 618 620
622 631 637 647 670 681-682 710
731 748 782 793-794 817 834-836
843 845 858-859 879 882 893-895
934 960 982 986 995-996 1000
1002 1005-1007 1025 1027-1028
1032 1039 1045 1071 1078 1097
1099-1102 1136-1137 1140
1219-1220
1260 1271 1297-1298 1314 1320
1322 1329 1339 1345 1365-1366
1370-1371 1398 1408 1423 1431
1437 1466 1468 1533 1539 1594
1602 1608 1614 1631 1649-1650
1660 1662 1673 1687-1688 1696
1711 1719-1720 1742 1746 1749
1760-1761 1765 1767 1771 1785



adrenal gland

Clontech

ADR002

4-10 15-16 25 29-31 43-45 47 50-
51 55 60 62-63 65-66 75 80 102
116 118 122 126 130 137 150 169-
170 181 192 198 201-203 215 227-
228 247 251 255 267-269 271 280-
281 285 295 298 311 336-338 342
349 351-352 354 372-373 383-385
391 400 410 415-416 424 426-427
431 434-437 439 445 454 461 473
477 483 491 493 497-498 503 516
519 527 535 546 549 552 572-573
581 588 595 600 602 608-610 620
628-630 637 645-646 670 679 703
713 715 719 732 734 744-746 758
773-778 789 816 829 837 845 848
869 875 883 898 904 912 922-923
930-931 942 948 952 965 967 969
976-977 981 990 992-993 1001
1004 1049 1055 1059 1071-1072
1076 1112-1113 1115 1121 1127
1134-1135 1151 1158 1163 1175
1181 1188 1209 1218 1224-1225
1227 1231 1243 1270-1271 1274
1280 1285 1290 1293 1307
1324-1325 1327 1330 1342-1343
1345 1348 1355-1366 1369
1378-1379 1387 1398 1400 1405
1417 1425-1426 1436 1440-1441
1444 1454 1463-1464 1488 1491
1507 1512 1538 1546 1567
1573-1575 1588 1598 1609 1614
1618 1622 1624 1627 1634 1636
1649 1651 1658 1671 1674
1678-1679 1691-1692 1703 1717
1727 1731-1732 1737 1765

adult heart

GIBCO

AHR001



4-8 10-11 15-16 18-21 34-39 44- 
46 50-52 57-58 60 62-63 71 75 82 
85 87 89 94 97 100 103-104 108- 
110 112 114 116 118-119 122-123 
127 130-132 134 136-138 141-144 
147-151 153 163-164 168-171 179 
186 192 195 197 199 204-205 212- 
215 220 225-226 229-230 232 234- 
236 251 257-260 262 265 272 274 
277 280-282 285-286 289-292 296
298-301 304 307 309 314 321 324- 
325 330 333 336-338 345 349 351- 
352 354 358 361 368 370 380 383- 
384 387-388 391 393 397 401 406
408-409 411-412 414-416 430-431
433-439 445-446 449 452 454-455 
457 459 462 469 472-473 476-480 
483-484 487-490 492-493 496-498 
503 506 508 510-513 516 519-522 
526 534 536-540 542 546 549 553 
560-562 574-577 581-582 584 586- 
587 589 593 595 597 604-609 611- 
612 615-620 622-623 626 632 637 
645-652 656-660 665-666 670-672 
674-675 683-684 687 692-694 697 
701 709 712 715-716 719-720 725-
726 728 730-732 735 738-739 743-
744 746 751 753 759 761 765 770-
771 775-780 785 788 790 796 802
804 810 812 817 821 826 828 830
837 843 845-847 849 853 857-861
863-864 869 871 875 877-879 881
883 887 890-892 894-895 897-898
901 903 906-907 911-913 915 919
921-925 927-928 933 935 945 958
961-963 967 969-972 975 977-978
980-986 990 992 999 1002 1005-
1007 1010 1016 1019-1020 1022-
1023 1025 1028-1037 1039-1040
1043 1047 1050 1054-1055 1057
1059 1063-1064 1067-1068 1070
1072 1075-1076 1083 1085-1087
1089 1093-1094 1104 1106 1108-
1109 1113 1116-1117 1119 1121
1124 1126 1128 1131-1134 1144-
1145 1148-1149 1151 1158 1167
1169-1170 1175 1177 1192 1196
1199-1200 1202 1206-1208 1211
1216 1218 1222 1227-122S 1232-
1235 1238-1241 1243-1244 1247-
1248 1250 1253-1254 1256-1258
1261 1268 1270-1271 1277 1280-
1282 1287 1292 1298-1299 1306
1308 1317-1321 1324-1325 1330
1332 1334-1337 1339 1344-1345
1349-1350 1354-1356 1359-1360
1365-1366 1369 1371 1374-1375
1378-1380 1383-1384 1389 1397
1400 1403 1409 1417 1423-1426
1437 1439 1442 1444 1446-1447
1450 1453 1468 1470 1473 1479
1481 1488 1490 1501-1504 1519
1521 1524 1528 1530-1534 1536-
1537 1539 1541-1542 1547 1553
1555 1560 1565 1567-1571 1588
1591 1597-1598 1601-1602 1605
1614-1616 1619-1620 1623-1628
1630-1632 1634 1636 1641 1644-
1645 1647 1649 1652-1655 1659
1662 1667 1673-1674 1680-1681
1684 1686-1688 1704-1705 1709
1711-1712 1717 1724 1726-1727
1731-1733 1737-1738 1741 1743-
1744 1749 1754-1755 1760-1761
1765 1772 1785

adult kidney



GIBCO 



AKD001 



4-8 10-11 17-21 29-31 35-39 42-
45 50-51 56 58 60-61 64 68-69 75
77 80 82 85 87 92-94 97 100 102-
104 107-108 112 116-117 119 123
127-133 136-137 139-141 143-144
147-154 157 161-163 165-166 169
172 176 178-179 192 194-197 199
201 203-206 209-210 212-213 215-
216 223-228 234-236 238 247 251-
253 257-259 261-262 265-269 271-
272 274 276-277 279-281 284-286
290 293 296 298-299 301-302 304
307 311-313 321 325-326 329-331
333 341 344 348-350 352 356 350-
359 362 364 365 368 370-372 374
376-377 380 382 392 395 398 400-
401 404 407 409 414-415 423-424
430-437 443-444 446 449 451 453-
455 459 461-462 464 467 469 471-
474 476-477 480-481 483 487-488
490-491 493 497-505 510-513 516-
520 522 524 526-529 534 537-540
544 547 549 554-556 560 562 564
567 571-576 578 582 586-589 592-
593 598-599 601 604-606 608-613
615-619 621-626 632-634 637-643
645-652 655 660-664 669-672 676
678-679 688 692-695 698 702 711
713 717 719-720 727 731 735-736
738 743 745-746 751 753 755 762-
763 765 771-773 775-778 780 786
788 793 795-796 800 803 805 808
810-812 814-819 821 826 829 832
834-838 842-845 848-855 857-861
864-865 867 869 871 874 876-883
886-887 889-891 893-896 898-900
902 906-908 910-914 918 920 922
925-927 929-935 937 940-942 945
948-949 951 953-958 960-961 963-
964 969-970 972 976-978 982-986
988-990 992-993 995-997 999-1002
1004-1008 1010 1012-1013 1016-
1017 1019-1020 1022 1025-1031
1035 1038-1040 1042 1044 1047
1050 1054-1055 1057-1064 1068
1070-1073 1078 1085-1086 1088-
1089 1092 1094 1097 1099-1102
1107 1109-1112 1116-1119 1121
1123-1125 1132-1135 1140 1142-
1143 1146-1147 1149-1150 1153-
1154 1157 1159 1163 1167 1170
1178-1179 1181 1183 1192 1196-
1200 1202-1204 1206-1211 1216-
1219 1221-1222 1225 1227-1230
1232-1234 1238-1241 1243-1244
1246-1247 1253 1257-1258 1260-
1261 1267-1268 1270 1272-1274
1281 1283 1287-1289 1293-1295
1299 1306 1308 1311-1313 1317-
1320 1323 1329-1330 1334-1335
1339 1341 1349-1350 1353-1357
1359 1367 1369 1373 1375 1378-
1379 1394 1397 1400 1403 1405
1407-1409 1417 1419 1423-1424
1428-1431 1433 1437-1438 1442-
1443 1445-1446 1448-1450 1453-
1454 1459 1461 1465-1468 1474-
1475 1478 1484-1488 1490 1492-
1493 1495 1497-1498 1506-1507
1509 1512 1518 1521-1522 1525
1527-1528 1532-1533 1537 1540-
1541 1547-1550 1552 1556-1559
1561 1565-1566 1568 1571 1575
1578-1579 1583 1586-1587 1589
1591-1592 1594 1598 1600 1603-
1604 1606 1608 1611 1613 1615-
1616 1618-1622 1624-1628 1631-
1632 1634-1636 1638-1639 1641
1644 1646-1649 1653-1656 1662
1664 1666-1667 1670-1671 1676-
1679 1683-1684 1686 1691-1692
1696-1699 1701 1709-1711 1713-
1714 1716-1719 1723-1724 1726-
1727 1733 1737-1738 1741 1743-
1744 1748-1749 1751 1760-1761
1763-1768 1778 1780 1785

adult kidney

Invitrogen



AKT002 



20-21 37-39 47 52 57 60 65-66 
68-69 80 104 107-108 122 130 133 
136-137 140 142-143 149 169 174
181 197 227-228 235-236 244 251
261-265 267 280-281 286 290 299
301 304-305 309 312-313 339 341
344-345 349 358 370-372 376 382-
383 387 392 401 414 416 421 430
443 445 449 453-454 472 487-488
504 506 513 516 519 522 528 536-
540 546 554 585 587 594 598 602
607 616-617 626-627 636 643 662-
664 695 709 721 735 743 761 768
775-777 788 796 804 814 827 837-
838 849-850 852-853 869-870 881
890-892 898 903 905-907 914 919
925 927 934 941 949 952 957 960
962 968 970 1000 1008 1029-1030
1044 1052 1055 1063 1067-1068
1073 1085 1099-1102 1107 1110-
1111 1113 1115 1119 1126 1134
1136-1137 1146-1148 1153 1159
1192 1196 1199 1232-1233 1241
1256 1264 1272-1273 1281 1285
1293-1294 1299 1312 1320 1324-
1325 1330 1344 1349 1351 1355-
1356 1365 1378-1379 1403 1414
1419 1428-1429 1436 1446 1458
1463-1464 1467-1468 1470 1477-
1478 1486 1491 1509 1519 1527
1529 1534 1547 1596 1600 1619
1623 1629 1631 1634 1638 1643
1647 1652 1660 1664 1667 1669-
1670 1673 1686 1709 1727 1740
1776

adult lung



GIBCO 



ALG001 



4-8 14 37-39 44-46 50-51 56 62-
63 75 82 88 93 103 104 113 125
133 140 143 150 152 154 157 162
171-172 174-175 190 191 196 200
211 214 219 223-224 227-228 251-
252 256 265 272 274 280-281 285
310 332 345 351 362 371 381-382
394 408-409 431 436 445 454 459
461 467 469 471 476-477 488 504
513 527 537-540 544 547-548 554
564 583 607 616-617 621 623-624
634 645-646 662-664 670 695 716
719 743-744 763 766 774 789 803
811 814 817 831-832 837-838 845
852-853 858-859 861 866 880 887
901 905 941 954-957 966 971 977
979 981 987 990 992 996 1001
1005-1006 1014 1017 1045 1047
1054 1059 1062 1064 1072 1080
1086-1089 1094 1107 1126 1134
1136-1137 1142 1150 1157 1173
1190 1200 1208 1220 1241 1272-
1273 1280 1282 1295 1306 1320
1331-1332 1353 1374 1379 1383-
1384 1404 1409 1423 1434 1436
1442 1474 1478 1494 1509 1522
1525 1531-1532 1547 1549 1553-
1554 1571 1598 1606 1613 1624
1627-1629 1632 1642 1644 1662
1669 1676-1677 1684 1696 1727
1731-1732 1737-1738 1748-1749
1786

lymph node



Clontech 



ALN001 



4 24 50-51 82 105 137 153 198 
201 223-224 234 268-269 272 280- 
281 287 301 312 329 343 382 421 
430 433 445 451 461-462 475 481- 
482 503 526 529 537-540 546-547
621 626 649 679 719 725-726 738
793 803 831 834-836 838 844 857-
858 866 879 905 913 928 963 976
1005-1006 1012 1038 1050 1116-
1117 1151 1199 1204 1226 1243
1265 1274 1324-1325 1339 1353
1374 1377 1440-1441 1447 1504
1549 1600 1618-1619 1631 1641
1644 1653 1687-1688 1691-1692
1741 1771



young liver 



GIBCO 



ALV001 



adult liver



Invitrogen



5-8 11 20-21 46 50-51 58 65-66
75 79 82 93 97 102-103 108 110 
116 139 143-144 148-149 171-172 
174 187-189 194-195 198 209 214- 
215 230 250 258 267-269 280-281 
306 309 342 351 356 359 362 372 
374 392 394 398 401 407-408 410 
414 431 444 455 459 476 478 483 
493 510-512 516 520 522 526 536 
549 571 574-577 585 592 601-602 
607 621-624 628-630 632-633 637 
648 660 666-667 678 697-698 700 
717 719 728 730 734 738 744-745 
766 770 773 779 788 800 808 812 
814 841 849-851 871 874 879 887 
893 898-900 902-904 906-907 911 
919 922 924 934 953 957 963 965 
970 984 986 997 1001 1004 1007 
1012 1029-1030 1033-1034 1052 
1061 1066 1070 1076 1086 1089 
1093 1099-1102 1110-1112 1116- 
1117 1119 1121 1125 1136-1137 
1144-1145 1156-1157 1159 1196 
1199-1200 1209 1211 1219-1220 
1241 1244 1262 1270 1275 1279 
1283 1295 1317-1320 1332 1339
1344 1359 1362-1363 1379 1383- 
1384 1403 1415 1430-1431 1437 
1450 1467 1475-1476 1483-1484 
1494-1495 1498 1505 1512 1516 
1518-1519 1526 1529 1547 1550- 
1552 1557-1559 1555 1583 1587 
1597 1609 1614 1620 1631 1637 
1641 1644 1654-1655 1662 1667 
1669 1684 1691-1692 1702 1711 
1725 1738 1741 1743-1744 1758 
1760-1761 1763-1765 1769



ALV002

5-8 17 20-21 32-33 41 55 58 64

75 77 86 89 102 108 117 119 175- 
176 198 200 209 231 235-236 250 
272 275-276 284 306 316 321 325
333 356 359 374 376 398 401 408 
414 428 430 433-435 454 476 494 
503-505 517-518 528 534 544 552 
561-563 567 578 581 608-609 630 
632 637 644 650 661 665 672 702 
707 710 721-722 750 753 778 782 
794 814 820 826 834-837 847 849- 
850 858 861 874 879 893 898 904 
911 918 921-922 926 946 948 972
978 986 996 1020 1027 1031 1034 
1053 1063 1068 1070 1073 1086 
1089 1093 1097 1113 1119 1156 
1159 1195 1198-1199 1208 1220
1227 1241 1261 1272-1273 1277 
1285 1308 1315 1320 1324-1325 
1330 1362-1363 1375 1403 1408- 
1409 1415 1431-1432 1435 1467 
1469 1482 1504 1524 1542 1547
1550 1567 1578 1581 1583 1594
1597 1601-1602 1611-1612 1615
1618-1619 1621 1625 1637 1645
1647 1652 1654-1655 1660 1666
1669-1671 1684 1706 1722 1737-
1738 1742-1744 1760-1761 1763-
1765 1772 1774

adult liver

Clontech

ALV003

29 676 997 1063 1119 1536 1766

adult ovary

Invitrogen

AOV001



1 4-18 20-23 29 35-40 42-48 50- 
51 53-58 61-63 65-66 68-69 73-75 
77-78 80 82 85 87 89 97 100-101 
103-104 106-108 110 113 115 118 
122-124 126 128 133-134 136-140 
142 145-147 149-157 161 166 168- 
170 174 177-173 180 182-186 188- 
189 192-203 207 209 211-215 219 
221-224 229-230 234 242-243 246- 
247 255 258 260-262 265-269 271- 
272 274 277-281 284-286 288 290
295 299 301-302 304 307 309-311 
313-314 316 321 323-326 330 332- 
333 335-338 341 344 349 352-353 
356 358 360 362 370-372 376-377 
379-384 387 390-392 394 397-398 
400 403 408-410 412 414-416 423- 
424 426-427 430-435 439 443-446 
448-449 451 453-455 462-463 468- 
471 473 476-479 481-484 487 489- 
494 496-497 499-501 503-505 509- 
514 516-517 519-520 522 524 526 
528-534 541-544 546-547 549 552 
554-555 561-564 566-567 569-570 
572-573 575-576 579 581 583 585-
588 590-591 593 595 597 599 601- 
605 607-613 615 618-622 624-627 
630 632-633 636-640 642 644-647 
649-652 654-655 657-665 667-675 
677-678 681 683-684 692-695 697- 
710 714-721 723 725-727 729 732 
734-735 743-746 750-751 753 758
763 765 767 772-773 775-778 780 
783-784 786 788 790-791 794-796
800 803 805 809-811 813-815 818- 
819 821-824 826 828-829 831-832 
837-838 843-850 852-857 859-864 
867 869 871-872 874-875 878-883 
887-888 890-895 898-910 912-914 
916 919-922 924 926-927 929-939 
941 943-946 948-951 953 955-958 
961-964 966-967 970-979 981-982 
985-986 988-990 992 995-997 999- 
1001 1004-1009 1011-1013 1016 
1019-1020 1024-1025 1029-1031 
1033-1035 1037 1039 1041-1047 
1050-1051 1054-1060 1062-1064 
1067-1070 1072-1073 1075-1076 
1078-1079 1085-1086 1089-1090 
1094-1096 1098-1103 1106-1108
1112-1117 1119-1120 1123-1127 
1131-1135 1142-1143 1146-1149 
1153 1156 1158 1163 1165-1166 
1169-1171 1173-1175 1177-1178 
1180 1183-1185 1190-1191 1195
1197-1200 1202 1205-1214 1217- 
1219 1221-1226 1232-1235 1238- 
1241 1243-1244 1247 1249 1252- 
1254 1256-1258 1262 1265 1267- 
1268 1270 1275 1278 1280-1283
1286-1289 1291 1293-1294 1298-
1299 1306 1308 1312 1317-1321
1323 1327 1329-1330 1332-1333
1338 1339 1341 1343-1351 1356
1359 1361 1365-1366 1371-1375
1377 1375 1383-1384 1386 1389
1394 1400 1404 1416-1417 1422-
1427 1429-1431 1435-1436 1439-
1443 1445-1450 1453-1454 1459
1463 1464 1466 1468 1470 1474-
1481 1484-1485 1488 1491 1493-
1494 1496-1498 1501 1504 1506-
1507 1511-1517 1519 1521-1524
1526-1527 1530-1531 1534-1536
1538 1539 1541 1546 1548-1550
1553 1555-1559 1561-1563 1566-
1567 1569-1570 1572 1574-1575
1578 1580-1581 1587 1588 1590-
1591 1595 1597-1598 1600-1606
1609 1611-1621 1623-1630 1634
1636 1638 1641 1643 1645 1647-
1657 1659-1662 1664 1667 1669-
1671 1673-1674 1676-1681 1683-
1690 1699 1702-1707 1710-1711
1713-1714 1716-1719 1723-1724
1726-1728 1731-1733 1735 1737-
1738 1740-1741 1743-1744 1748-
1751 1753 1755-1756 1760-1762
1765 1767-1768 1770-1771 1776
1778-1779 1783-1784 1786

adult placenta

Clontech

APL001

5-8 44-45 90-91 107 108 159 178
311 351 414 476 503 545 574 624
636 719 755 773 860 890-891 924
947 955-956 962 990 992 1002
1045 1202 1320 1369 1628 1686
1713-1714 1743-1744

placenta

Invitrogen

APL002



14-16 26 29 43 60-61 79-80 103
106 116 135 171 177 180 194 196
198 210 216 235-236 272 290 299
309 329 334 339 359 379-380 417
423 430 434-435 448 454 483 490-
491 517 522 631 723 725-726 728
738 746 769 818 843 854-855 857-
858 916 948 953-954 976 988-989
1005-1006 1013 1033 1036 1064
1068 1070 1086 1139 1144-1145
1160 1277 1285 1317-1320 1343
1345 1429 1435 1438 1454 1482
1486 1490 1512 1519 1532 1549
1592-1593 1602 1626 1647 1649
1664 1673 1675 1722 1727 1730
1746 1776



adult spleen



GIBCO 



ASP001 



3 5-8 12 15 16 19-21 24 29 34-36
44-45 57 60 82-83 87 89 94 98-99
103 106 108 117 119-121 139 141
147 152-153 155 166 169 171 174
178-180 196 198 201-206 209-211
215 219 234 253-254 256 258 264
272 280-281 290 295 302 309 312
325 333 341 349 358 372 382 386-
387 394 406 414 431 434-436 446
448 451 473 481 490-493 500 503
505 517 519 530 534 536-540 547
554 557 574 576 582 592 595 604
611-612 620 621 623 631-632 642
652 659 661 667 671 673-675 684
700 721 728 730 732 738 742-744
746 762 765 774 780 788-789 794
810-811 817 822 830 832 845 848
852-853 858 862 866 874 879 882
884 906-908 912 919 921-923
925-927 934 942 949 957-958 963
977-978 983 990 992-994 996-997
999 1005-1007 1010 1012 1031
1036 1042-1044 1046 1049 1059
1068 1070 1076 1089-1090 1094
1103 1109 1113 1115 1154 1140
1163 1170 1174 1177 1190 1196
1219-1220 1226-1227 1229 1236
1241 1246 1258 1269 1271 1274
1295 1301 1320 1322 1330
1334-1335 1339 1349 1351 1353
1359-1360 1364 1369 1374 1386
1397 1413 1417 1434 1436-1437
1439 1468 1474 1477 1480
1485-1487 1498 1512 1522 1525
1544-1549 1553 1560 1567 1591
1600 1631 1636 1651 1654-1655
1658 1662 1670 1674 1678-1679
1684 1686 1700 1727 1733 1738
1740-1741 1760-1761 1774 1779
1781-1782






GIBCO 


ATS001 


5-8 10 26 30-31 47 50-51 57
68-69 82 84-85 97 102 113 119
137 139 150 152 154 156 163 169
174 176-177 192 194 196-197
212-215 227-228 247 255 258 261
282 285 288-289 301 307 311 316
330 334 349 370-372 392 398 410
415 426-427 430-431 433 437 446
454 461 469 473 477 481-482 493
499 502-503 513 522 526 547
552-553 563-564 572-573 575-576
581-582 585 599-602 605 612
615-617 620 631 637 647 649-650
656 660 665 670 674-675 712
719-721 723 728 731 738 744 746
773 780 784 788-789 802 804 809
811 814 826 831 837 843 845 848
859 866 869 877 905 913 916 919
921 926 929 937 950 960 963 971
975 977 981 990 992-993 1007
1016 1029-1030 1034-1035
1038-1039 1045 1059-1060 1064
1070 1072-1073 1087 1089 1097
1099-1102 1104 1108 1113 1141
1149 1161-1162 1175 1208-1209
1222 1227 1229 1231 1235
1238-1239 1243 1253 1285
1287-1289 1291-1293 1307 1311
1317-1320 1330 1332 1338 1345
1369 1373-1374 1379 1389
1399-1400 1409 1423-1424 1430
1435-1437 1443 1459 1484 1486
1490 1493 1496-1497 1501 1505
1509-1513 1527 1530-1531 1533
1537 1546 1549 1563 1565 1567
1569 1571 1577 1586 1591 1599
1602 1625 1628 1630-1632 1636
1639 1642 1649 1661-1662
1666-1667 1670 1675 1684 1690
1699 1705 1712 1717 1724 1730
1737-1738 1752 1767 1779










Genomic DNA
from BAC 63118 


Research 
Genetics 
(CITB BAC 
Library) 


BAC001 


-684 


1352


1412 








Genomic DNA 
from BAC 39316


Research 
Genetics 
(CITB BAC 
Library) 


BAC002 


1411-1412

Genomic DNA
from BAC 39316

Research
Genetics
(CITB BAC
Library)

BAC003

1352

adult bladder

Invitrogen

BLD001

5-8 17-18 22-23 33 37-39 56-57
80 93 100 120-121 169 201 237
251-252 272 278 311 348 353 382
413 415 424 430 443 483 502 542-
543 562 564 607 616-617 626 635
652 667 671 710 727 755-756 762
773 786 789 837 840 866 893 898
909 918 929 966 977 983 1016
1025 1055 1073 1082 1140 1167
1185 1189 1199 1270 1369 1481
1536 1560 1573 1596 1614 1636-
1637 1649-1650 1654-1655 1658
1669 1671 1690 1719 1727 1731-
1732 1739 1741 1760-1761 1779

bone marrow

Clontech

BMD001



3-8 11 13 18 29-31 33 35-36 40 
43-45 47-48 50-51 57 60 65-66 75 
80 82 85 88-89 94 100 103 107 
110 115 118-119 124-125 133-134 
136-137 139-141 146 150 152-153 
155 161 163 168-170 172 178-180 
187 192-193 197-198 203-205 210- 
213 215 217 219 222 224-226 233 
235-237 242-244 255 258 260 263- 
264 266 273 276 278 283 286 290 
295 301-302 307 312-313 321 330 
333 339 343 352 357-358 370-371 
382 384-385 387 389 394 408 410 
412 416 421 424-427 429-431 436- 
437 439 441-442 445 447 454-456 
461-462 471-472 475 477-479 481- 
482 485 488 493 498 500 503-506 
513 516 519 523-524 526 530 535- 
540 542 544-545 549 555 565 567
569-577 581 583-586 588 593 601 
603-604 608-609 613-619 621-622 
632-633 636-637 642 649-650 656- 
660 666 670 672 674-675 679 683 
701 708 716 718-720 731 735-736 
740-742 744-745 752 761 765 772- 
773 775-778 780 785-786 789-791
796 798 802 810-812 823-824 826 
830 832-833 837-838 843-844 848- 
855 858-859 866-867 869 878-880
883 890-892 896 903 905 908 912- 
914 922-924 927 930-931 937 939- 
941 952-953 955-958 963 969 973 
976 981 985 987 990 992 995 1000
1002 1005-1007 1013 1016 1025 
1028-1031 1033 1035 1037 1039 
1042 1044 1047 1050 1053-1054 
1059 1061 1063 1066 1070-1071 
1079 1106 1110-1113 1115-1117 
1124 1126 1134-1135 1142 1144- 
1145 1163 1172 1178 1197 1199- 
1200 1202 1216-1217 1224 1227- 
1228 1240 1246 1254 1261 1266 
1270 1278 1281 1285 1287 1290-
1291 1293 1299-1301 1308 1314 
1317-1320 1327 1331 1339 1343 
1346 1349 1353 1356 1361 1367 
1369 1372-1374 1379-1380 1394 
1400 1403 1406 1408 1413 1417 
1419 1423 1425-1427 1430-1431 
1433 1439 1443 1446-1449 1459 
1463-1464 1482 1486 1493-1494
1506 1509 1513 1521-1522 1524
1526 1528 1531 1536-1537 1543
1546 1548-1549 1552 1554-1555
1557-1559 1571-1572 1581 1589-
1592 1597-1600 1609 1614 1621
1626-1628 1630-1632 1634 1636
1638-1639 1641 1646-1647 1651
1653-1655 1661-1662 1676-1681
1684 1686 1690 1702 1707 1711
1713-1714 1717 1720 1722-1723
1727 1737-1738 1740 1758 1767
1772 1781-1782 1785-1786

bone marrow



Clontech 



BMD002 



11 15-16 19 30-31 35-36 68-69 75 
83-84 93 99 103 108-109 118 137
139 169-170 174 177 180 190 193 
212-213 219 222 225-226 232 237 
255 259 264 273-274 284 286 290- 
292 295 301 303-304 307 312-313 
316 324 326 330 334-335 348 352- 
353 357 360 370-373 384 386-387 
397 403-404 414-416 421 425-427 
429-430 433-436 440 444 451 454 
465-466 472 475 478 491 493 516 
520 523 525 531 545 548 552 565 
569-570 581 583 590-591 597-598 
601 616-617 621 641 650 652 656 
659 671 674-675 679 684 710 718- 
719 728 734 737-738 742 761 765 
774-778 790 811 814 818 830 834-
836 854-855 859 866 869 871 878- 
879 884 889 892 904 922-923 932 
990 992 998 1001 1004 1016 1036
1042 1048 1051 1054-1055 1058 
1088-1089 1106 1112-1114 1155 
1157 1192 1200 1223 1227-1228 
1236-1237 1260-1261 1282-1283 
1285 1287 1295 1314 1317-1321 
1324-1327 1330 1333 1341 1343 
1347 1350 1353 1355-1357 1367
1369-1370 1373 1377 1379 1381 
1383-1384 1394 1397 1400 1406 
1413 1417 1425-1427 1438 1442 
1446 1459-1460 1470 1493 1505 
1521 1536 1546-1549 1560 1573- 
1574 1578 1598-1600 1621 1626
1631 1634 1646 1649 1653 1656 
1658 1669-1670 1683-1684 1687- 
1688 1690-1693 1696 1699 1702
1704 1707-1709 1711 1720 1722- 
1723 1725 1727 1729 1731-1733 
1738-1740 1743-1746 1752 1755 
1760-1761 1767 1777 1781-1782 
1786 



bone marrow

Clontech

BMD004

73-74 503 922 1036 1711

bone marrow

Clontech

BMD007

95-96 866 1320 1475



adult colon 



Invitrogen 



CLN001 



17 56-58 103 110 117 144 150 171 
179 185 188-189 201 204-206 210 
218-221 225-226 231 237 251 277 
288 310 312 320 333 359 386 388 
394 408 420 455 481 485 503 510- 
512 590-591 615 635 647-648 665 
672 684 697 710 725-726 743 780 
786 788 826-827 848-850 854-855 
858 866 872 898 918 921-923 953 
976 983 993 1005-1006 1017 1020 
1025 1027 1054-1055 1063 1068- 
1069 1140 1153 1170 1185 1196 
1199 1220 1280 1314-1315 1320 
1345 1351 1355 1369 1428 1439
1462-1464 1512 1556 1583 1587
1594 1596 1614 1625-1626 1631
1639 1645 1650 1675-1677 1687-
1688 1701 1713-1714 1724 1740
1765

Mixture of 16
tissues -
mRNAs



Various 
Vendors 



CTL016 



401 1490 1686



Mixture of 16
tissues -
mRNAs

Various
Vendors

CTL021

312 782 1132-1133 1403 1712 1715

adult cervix

BioChain

CVX001



1 4-8 11 13 18-21 2^-26 30-31 33 
37-39 43 46-47 58 61 64-66 71 
73-74 82 85 94 100 103-104 113 
118 122 126 130 134 140 147 153- 
156 163 170 179 181 186 192 195- 
196 198 201-202 218-219 222 229-
231 257 266 276-277 285-286 288 
298 301-302 304 307 312-314 324 
325 329-330 332 335 342 352 358 
362 371-372 376 379 381-382 384 
388 398 400 410 414 416 419-420 
426-427 430-431 433-436 439 446 
448 461-462 464 471-477 479 482- 
483 491 493 496 503 506 510-513 
516-517 526 530 535 542-544 546- 
547 557 561 572-573 575-577 581- 
582 585-586 588-589 593-594 600
602 604-605 607-609 612 615-619 
623 644 650 654 657-658 662-665 
670 672 680 683 691-694 698 706 
708-709 711 713 720-721 727 729 
731-732 737 745-747 753-754 760 
765 771 774-777 780 790 793 796 
798 800 803 805 818 826 828 831- 
832 834-836 843 847-848 851-855 
857-860 864-866 869 871 876 878- 
880 882 887 890-891 897 899-902 
905-908 912-913 916 918-919 922 
927 932 934-938 944 948 955-956 
958 963-964 967 969-970 972 976 
978-979 983 985 990 992 1000 
1005-1007 1016-1017 1024 1027 
1033 1036 1038 1045 1047 1053- 
1056 1066-1067 1071 1073 1075 
1079 1082 1098 1113 1124 1129 
1134 1139 1146-1149 1163 1167 
1170 1173 1175 1177 1181 1197 
1200 1202 1211 1214 1216 1221- 
1222 1225 1227 1232-1234 1240- 
1241 1243 1258 1264-1265 1268 
1270 1279 1287-1290 1308 1310- 
1311 1316 1320 1323 1327 1345 
1349 1353-1354 1360 1372-1374 
1383-1384 1386 1394 1397 1405- 



The 16 tissue mRNAs and their vendor sources are as follows: 1) normal adult brain
mRNA (Invitrogen), 2) normal adult kidney mRNA (Invitrogen), 3) normal adult liver
mRNA (Invitrogen), 4) normal fetal brain mRNA (Invitrogen), 5) normal fetal kidney
mRNA (Invitrogen), 6) normal fetal liver mRNA (Invitrogen), 7) normal fetal skin mRNA
(Invitrogen), 8) human adrenal gland mRNA (Clontech), 9) human bone marrow mRNA
(Clontech), 10) human leukemia lymphoblastic mRNA (Clontech), 11) human thymus
mRNA (Clontech), 12) human lymph node mRNA (Clontech), 13) human spinal cord
mRNA (Clontech), 14) human thyroid mRNA (Clontech), 15) human esophagus mRNA
(BioChain), 16) human conceptional umbilical cord mRNA (BioChain).
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Tissue Origin 



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



diaphragm 



BioChain 



DIA002 



endothelial 
cells 



Stratagene



EDT001 



1406 1416 T4?5" T42T 1431 1436
1437 1442 1446 1448 1453 1459
1466 1472 1478 1482 1496 1501-
1503 1506 1512 1522 1527-1528
1531 1533 1541 1547 1569 1571
1585 1589 1597-1598 1600 1608-
1609 1614-1616 1620 1623-1624
1626-1628 1630 1638 1641 1643
1649 1653 1656 1662 1667 1669
1674-1675 1683 1685-1688 1699
1702 1709-1710 1715 1717 1722
1724 1729 1731 1732 1735-1739
1741 1743-1744 1748-1749 1755
1760-1762 1767 1773 1778 1785-
1786



137 282 2 
1478 1599 



89 730 780" 
1614 



986 1409 



3 5-10 13 15-21 24-26 29 34 37- 
39 42 44-45 50-51 53-55 57-58 
60-61 65-66 68-69 73-74 77-78 80
82-83 85 87 89 93-96 101-105 108
110 112-114 116 118-122 124 128 
133-134 137-142 147-150 152-153 
161-163 166-172 176-179 107 190 
192 194 196-201 204-207 210 212- 
214 220 224 225-230 233 235-236 
240-241 251-252 258 261-262 265 
267-269 272 276-277 279-281 284- 
285 288 290 295-296 301-302 310- 
311 313 316 321 325 329 331-333 
335 340-342 351-355 360 371 375
380-382 384 387 390 392 397 400 
407-408 410 412 414 416 425-427 
431 434-436 439 444-445 449 454 
463-464 472-475 477-479 486 488- 
490 497-498 500-504 510-513 516- 
519 522 524 526-528 532-534 536- 
540 542-546 548 561-563 566-567 
572-576 579 581 585-586 589 593 
595 597 599 603 607-612 615-617 
620 622 626 630 632-634 638-641 
644 647 656-660 662-664 670 673
678 680-682 692-697 707 709-710 
712-713 719 730 732 734 736 738 
743-746 751 759 768 771 772 775- 
778- 783 786-789 793 800 803 805- 
807 810-811 814 816-818 821-822 
824 826 828-829 832 834-836 842- 
845 848-850 854-860 862 864 869
871 874 876-879 883 885 887 890- 
891 894-895 898-900 903 908 910- 
913 916 919-922 924 926-928 930- 
935 939 943 948-949 951-954 957 
959-961 964 959-970 973 975-978 
983-904 988-990 992-993 996-997 
1000 1002 1004-1013 1016-1020 
1022-1025 1028 1031 1033-1034 
1038-1046 1050 1055-1056 1059- 
1060 1062-1064 1067-1070 1072- 
1074 1076 1078 1082 1086-1087 
1089-1090 1093-1097 1099-1103 
1107 1109-1113 1116-1117 1124- 
1126 1128-1131 1134-1135 1138 
1140 1144-1145 1148-1149 1153 
1157 1160 1163 1171 1183-1184 
1198-1199 1202 1205-1207 1211 
1216-1217 1219 1221 1225 1229
1232-1235 1238-1241 1243-1244 
1246 1250 1253 1257-1258 1261 
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Tissue Origin 


RNA Source 


Hyseq
Library Name


SEQ ID NOS: 








1265-1266 1268 1270-1271 1274-
1277 1280-1283 1285-1286 1288-
1290 l^yj 1273 12?o 1JUB l j i<i
1317-1320 1324-1325 1327
1330 1334-1335 1338 1342
1345-1347 1350 1355-1356 1 ICQ
1367 1369 1374 1376 1379 1398
1400 1406 1408 1414 1417 19 iy
1424-1426 1428-1431 1434-1438
1440-1442 1448 1450 1462 - 14 ob
1468 1472 1474 1478 1487-1488
1491-1493 1501-1504 1506 1509
1511 1516 1520-1521 1526 1529
1531 1536-1537 1539-1540 1546-
1547 1549 1552 1555 1557-1559
1561-1565 1568 1571 1575 1578-
1579 1581-1583 1587-1588 1590
T COO 1597 1605-1606 1611 1613
1618-1621 1624-1628 1630-
lb J 1 1634 1636 1638 1641 1643-
1650 1652-1659 1664 1666-1667
ICCQ ibby 1671 1675-1681 1683-1688
1696-1698 1703 1711 1715-1716
1719 1722-1723 1726 1731-1733
1736 1739-1741 1743-1744 1749
1755 1760-1761 1765 1767-1768
1771-1773 1776 1779 1783-1786


Genomic clones 


. — -. — - 

Genomic DNA 


&jri*iuui 


286 


686 1297 1303-1304 1352 


from the short 


from 




1411 


-1412 1754 






arm of 


Genetic 












UJli. UlilwowlllO O 


Research 












ecophacfuo 


a L o i_r.£i in 


bOwVU £ 


131- 


132 261 289 380 


503 


860 092 






1000 


1007 1397 






fetal brain


Clontech




62-63 89 112 126 194 322 336-338
379 391 411 481 546 563 607 679
710 867 1012 1031 1055 1251 1262
1320 1407 1643 1652 1686 1731-
1732 1746 1765






fetal brain 


Clontech


t* OKU Lf* 


68-69 90-91 139 212-213 301 331
362 374 403 436 611 645-646 659
668 670 691 785 805 845 1163
1209 1216 1232-1233 1238-1239
1387 1410 1416 1430 1496 1536
1547 1593






fetal brain 


Clontech




5-9 25 43 60 62-63 55-66 70 72
00 87 92 101 103 108 114 136 139
149 152-153 157 168 171-172 175
207-208 210 212-213 221-226 237-
238 251-253 266 272 279-281 295
301-302 307 310 317-318 321-324
330 333-334 336-338 346-347 352
357 370 373 377 379-380 382 384
391-392 397 399 402 406-408 410-
411 417 421 424 426-427 430 436-
437 440-443 454 460 464 467 473
476 483 488-489 495 497 508 510-
513 516 519-520 524 530 537-540
544 547 550 561 567 572-574 582
590-591 595 597 604 607-609 615
623 628-629 631 634 638-640 655
657-658 660 665 669 674-675 679
689 691-694 696-697 699 701 706
710 716 720 728 732 734 736 742-
744 757-760 763 775-778 780 799
806-807 810 817-818 826 839 843
858 861 864 871-872 884 890-891
894-895 890 904 915 921-923 935-
936 938 945 950 952 955-956 958-
959 961 963 967 969-971 990 992
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Tissue Origin 



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS:



999 1001 1005-1006 1008 1013
1016 1022 1024 1029 1030 1032
1035 1042 1047-1048 1052 1056
1065 1067 1070 1082 1089 1109
1114-1115 1119 1131 1143-1149
1151 1153 1156 1160 1163 1167
1172-1173 1178 1184 1186 1188
1190-1200 1211 1216 1222-1223
1226-1227 1229 1231 1236 1245
1253-1255 1258 1260 1262 1266
1270-1273 1281 1287 1308-1309
1314 1317-1320 1326 1334-1335
1339 1341 1344 1350 1356 1369-
1371 1373 1376 1379 1381-1382
1386 1392 1396-1398 1419 1423
1425-1426 1428-1429 1432 1437
1440-1441 1448 1466 1470 1482
1502-1503 1507 1511 1513 1516
1519 1536 1544 1549-1550 1557-
1559 1573 1589-1590 1598 1608
1611-1614 1619 1621 1625-1626
1640 1651 1657-1658 1676-1679
1693 1696 1703-1704 1713-1714
1718 1720 1722 1724 1726 1728
1730-1733 1735-1736 1736-1739
1742 1745 1755 1759 1761 1765
1767 1771 1772 1777 1779-1780
1786



235-236 520 864 1068 1188 1587 
15-18 20-21 24-25 29 34 43 61-63 
77-78 98 101 103 107-108 128 130 
136 146 148 165-166 171 174 181 
185 196-198 204-205 208 223 230 
235-236 251 253 261 268-269 280- 
281 284-285 288 309-311 321 329 
334 339 346-347 350 357-359 381- 
383 390 407 418-419 430 434-435 
438 443-444 461 464-466 483 490 
494 509 516 519 522 527 557 561-
562 572-573 590-591 595 597 623 
632 647-648 650 655 669-670 672 
682 690-691 700-701 710 717 736 
746 782 784 788-789 814-815 825 
829 840-841 847 854-855 857-858 
897-900 904 919 925 935-937 946 
948-949 954 960-962 966 969-970 
986 996 1000-1001 1005-1007 1012
1014 1022-1028 1045 1052 1055 
1068 1070 1072 1078 1082 1085 
1090 1109 1115 1118 1120 1128
1136-1137 1144-1145 1149 1156- 
1157 1193-1195 1198 1204-1205 
1220 1222 1234 1257 1262 1271 
1274-1275 1280 1285-1286 1294 
1312 1314 1317-1320 1330 1342 
1344-1345 1349-1350 1355-1356 
1358 1364 1369 1379 1383-1384 
1431 1435 1476 1507 1519 1532 
1536 1547 1554 1564 1567 1578 
1582 1587 1593 1595 1601 1608 
1615 1619-1621 1638 1644 1661 
1665-1666 1673 1687-1688 1690 
1715 1723 1728 1749 1753 1757
1759-1761 1765 1771 1774 1776 
1778 1781-1782 1786 



fetal brain 



Clontech 



FBRS03 



fetal brain 



Invitrogen 



FBT002 



fetal heart 



Invitrogen 



FHR001



105 124 180 289 864 1036 1148 
1229 1614 1616 1762 1785 



fetal kidney 



Clontech 



FKD001 



5-8 11 40 47 57 65-66 82 85 102 
124 163 171 216 222 224 235-236 
Tissue Origin | RNA Source



fetal kidney 



Hyseq 
Library Name 



SEQ ID NOS: 



258 277 280-281 307 310 314 330 
371 387 392 395 403 422-423 431 
436 443 455 469 500 519 522 542 
563 572-573 585 600 619 623 650 
654 657-658 660 679 719 731 760 
798 821 833 844 854-855 857 864 
868 878 911 929 958 960 969 990 
992 1007 1046 1087 1103 1129 
1139 1285 1312 1331 1355 1369 
1371 1376 1391 1422 1425-1426 
1440-1441 1470 1543 1598 1601 
1618 1631 1651 1654-1655 1669 
1678-1679 1691-1692 1733 1785 



Clontech 



FKD002 



fetal kidney 
fetal lung 



352 384 426-427 440 583 602 1060 
1131 1324-1325 1636 



Invitrogen
Clontech 



FKD007 



20-21 82 163 335 679 988-989 
1000 1227 1230 1320 1554 



FLG001 



fetal lung 



35-36 94 323 371 398 426-427 445 
473 549 560 604 616-617 626 631 
649 651 719 746 786-787 832 842 
849-850 864 894-895 1075 1178
1182 1200 1206 1309 1311 1345 
1429 1493 1567 1576 1620 1686 



Invitrogen 



FLG003



fetal lung 



9 15-16 29 41 47 68-69 83 88-89 
102 124 137 152-153 165 196 224 
229 231 249 254 256 267 291-292 
300 325 333 344-345 352 373 376 
379 384 408 425-427 430 432 467- 
468 475 483 488 493 516 531 535 
545 547 549 564 582 602 623 644 
660 662-664 670 673 725-726 728 
761 766-767 774 805 830 852-853 
864 875 921 932 937 946 949 963 
988-989 1014 1016-1017 1024 1027 
1090 1097 1170 1185 1200 1215- 
1216 1224 1258 1290 1309 1320 
1342 1347 1355 1369 1381 1413-
1414 1431 1438 1449 1491 1512 
1536 1547 1557-1560 1567 1590
1601 1636 1644 1653-1655 1662 
1667 1671 1675 1680-1681 1706 
1739 1760-1761 1769 



Clontech 



FLG004 



fetal liver - 
spleen 



103 276 334 465-466 737 843 1131
1614 1658 



Columbia 
University 



FLS001 



3-11 13 15 21 25 30-39 41-48 50-
51 54 56-58 60-66 68-69 72 75
77-80 82-83 85 87 89 92-103 105-
110 112 116-124 126-127 130 133
135-139 141 144 147-149 152-153
157 163-165 167-172 174 176-178
180 186 188-190 193-194 196 198-
200 202-206 210-214 219 221-231
233-236 240-244 246-247 250-251
255-256 258 261-265 268-269 272
274 276-278 280-281 284-286 288
293 295 299-301 304 306-307 309
311 314 316 318 320-321 326 329-
332 342 344-345 350 352-353 356-
358 360 362 370-374 376 378-384
386-387 390 392-393 400-401 403
406 408 410 412 415 417 419 422-
437 439-442 444-445 448 452-454
456 459 461-470 472-479 481-483
487-488 490-491 493 500-501 503-
506 509-513 515-520 522-524 526-
529 531 534 536-540 542 547-549
553-554 561-562 564 567-568 571-
576 579 581 583 585-597 599-605
Tissue Origin | RNA Source



Hyseq 
Library Name 



SEQ ID NOS: 



607 610-613 615-621 623-624 626 
628-634 636-640 644 647-650 655- 
660 665 669-670 672 674-675 678 
681-682 684 690-695 697 702 708- 
710 713-714 716-719 725-728 730- 
731 734 736 738 740-741 743-746 
748 750-751 759-766 768 772 774-
777 779 783-788 793 796 798 800- 
805 808 810-812 814 818-819 821-
824 826-832 834-837 843-847 849- 
367 869-876 878-883 887 889-895 
697-898 902 904-914 916 919 921- 
928 930-937 939 945-950 953-958 
960-961 963-965 967 969 971 974- 
978 980-983 986 988-990 992-993 
995-997 1000-1002 1004-1008 1012 
1014 1016-1019 1025-1026 1028- 
1031 1033 1035-1036 1039-1044 
1047 1049-1050 1053-1056 1058-
1059 1061-1064 1067-1070 1072- 
1074 1076 1078 1082 1085-1087 
1089-1090 1097 1099-1103 1107- 
1113 1115-1119 1121-1123 1125 
1127-1128 1131-1134 1136-1137 
1144-1150 1153 1159-1160 1163 
1170 1175 1177-1178 1188 1190- 
1192 1195-1200 1202 1206 1208- 
1211 1214 1216 1218 1221-1222 
1225 1227 1234 1237 1241 1244 
1246-1247 1251 1254 1258 1261 
1266 1268 1270-1273 1277-1282 
1284-1285 1287-1290 1294 1299- 
1300 1306-1308 1313-1320 1324- 
1325 1327 1330 1332-1333 1338 
1341 1343 1345-1347 1349-1350 
1353-1360 1362-1363 1365-1367 
1369-1370 1372-1374 1376 1378- 
1381 1383-1384 1386 1389-1391 
1400 1402-1403 1405-1410 1413 
1415 1417-1419 1422-1429 1431 
1435-1437 1439-1442 1445-1446 
1448-1449 1454 1458-1459 1466- 
1470 1472 1474 1477-1478 1480 
1482 1485 1491-1493 1496-1498 
1501-1507 1509 1511-1512 1516-
1519 1524-1526 1529 1532 1536- 
1541 1546-1547 1549-1550 1552- 
1554 1562 1564 1569 1572 1574- 
1575 1578 1581 1583 1587-1588
1591-1592 1594-1595 1597-1598 
1600-1604 1611-1612 1614-1615 
1617-1618 1620-1622 1624-1625 
1627-1628 1630-1632 1634-1639 
1645-1651 1653-1662 1664 1667- 
1669 1671 1673-1674 1676-1688 
1690 1696 1701-1703 1706-1709 
1711 1713-1714 1718-1719 1722 
1724-1727 1731-1733 1738 1740-
1741 1743-1744 1746 1748 1751- 
1752 1754 1760-1765 1767-1773 
1780 1783-1786 



fetal liver- 
spleen 



Columbia 
University 



FLS002



3-11 13 15-21 26 29 32 35-39 42 
44-45 48 50-51 54-55 57-58 61 64
68-69 73-75 78 80 82 84 87 95-98 
100 103 105 107-108 110 112-113 
116-119 122-125 128 130 137-138 
145 147-153 155 157 159 161-163 
166 168 171-172 174-175 177 181 
188-189 193-194 196-198 200-203 
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Tissue Origin 


RNA Source 


Hyseq








SEQ ID NOS : 






































215 219-221 O 225-229
231-232 240-244 246- t 250-251
258-259 262 264 268-269 272 275
277 280-281 284 286 288 290-292
295 298-299 301-304 306 308-310
318 320-321 323 325 329 331 334
342 348-349 352-353 356 359 368
371 374 376-379 381-384 386-387
392-393 397-398 400-401 403 410-
413 421 423 426-427 429-430 433-
436 438 440 443 445 448 451-452
454-455 460-463 465-467 469 H 1 X -
473 475-476 478-479 481-483 *kO f
490-491 493-494 497 500-501 eft - *-
505 509-513 515-517 519-520 CO A
526-531 534 537-542 544 547 ceo
554 556 558 561-562 564-567 571-
577 583-587 590-591 593 595 597
601 604-606 608-613 616-617 619-
624 626-632 634 637-642 644 647
649-652 654-659 665 669-672
674-675 681-682 685 688 690 696
698 700-703 707 709-710 713 717
719-721 723-724 728 731-732 734
737-738 742-745 748 752 754 759
763-766 768 770 773-777 780 782
784 786 791 795-798 801-802 805
808 811-812 818 823-824 826-827
832 834-837 839 843 846 848-856
858-861 865 867 869 871 873-874
876 878 881-882 887 889 892 894-
898 901-902 904 906-908 913-915
919 921-924 926-932 934-935 937
939-941 943 946-947 950 953 958
961 965-967 971 973-975 977-979
981 984-985 990 992-993 995-997
999 1001 1004-1007 1009-1011
1013 1016 1020 1023 1025 1027-
1031 1033-1035 1039-1042 1044-
1045 1049 1053 1055-1056 1058-
1059 1062 1064-1065 1067-1070
1072-1074 1079 1082 1087 1089
1093 1097 1099-1103 1105-1107
1109-1114 1123 1125-1127 1132-
1134 1140 1143-1145 1148-1150
1156 1158 1160 1163 1172-1173
1177-1178 1181-1184 1190-1192
1195-1197 1199 1204 1206 1208
1211 1214 1216 1219 1227 1230
1234-1235 1237 1240-1241 1243
1245 1247 1256 1258 1260-1261
1264 1268 1270-1271 1275 1278-
1279 1284-1286 1288-1289 1299-
1301 1306 1308 1312 1314 1317-
1319 1323-1325 1327-1330 1334-
1335 1339 1343-1347 1349-1350
1354-1355 1357 1360 1362-1363
1365-1367 1369 1372 1376 1378-
1380 1386 1389-1391 1394 1400
1403 1406 1409 1416-1419 1422-
1427 1429 1435 1437-1438 1440-
1442 1446 1448-1450 1453 1460-
1461 1468 1470 1472 1474-1475
1478 1482 1486 1490-1493 1496
1498 1500-1504 1506 1508-1509
1511-1512 1516 1518-1519 1521
1524-1528 1531 1536-1538 1543
1547 1550 1554 1556 1564 1567-
1569 1580 1587-1588 1591-1592
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Tissue Origin 



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS:



159T 1^98 1600-1601 1611-1612
1618 1628 1630-1631 1635-1638
1641 1646-1649 1652 1654-1659
1661 1664 1667-1669 1674
1676 1679 1683-1684 1686-1688
1691-1692 1699 1702 1707 1711
1713-1714 1717 1719 1722 1726-
1727 1730-1733 1738 1740 1743-
1744 1748-1752 1758 1760-1761
1763-1764 1767 1769 1772-1773
1776 1779 1783-1786



fetal liver- 
spleen 



Columbia 
University 



FLS003 



103 300 318 321 352 372 379 381 
384 392-393 403 422 424 429 434- 
435 440 444 453 503 515 544 592 
978 1064 1324-1325 1327 1333 
1357 1369 1378 1418 1424 1622 
1646 1649 1680-1681 1689-1690 
1717 1743-1744 1769 



fetal liver 



Invitrogen 



FLV001 



15-l£ 26 "34 58 61 64 70 75 78 89 
98 105 112 116 120-121 123 133 
151 166 176 180 194-196 198 200 
204-206 210-211 220 225-226 230 
235-236 239 247 259 261 267 272 
277 280-281 303 310 313 317 320- 
321 329 344 356 371 374 376 379- 
382 395 408 412 414 419 429 434- 
435 441-442 465-466 490 494 504- 
506 509 522 527 534 552-553 562
567 569-570 572-574 607 631 657- 
658 667 669 672 685-686 702 717 
725-726 732 748 759 761 778 784 
786 809 817 829 037 857 861 872- 
873 875 881 889 894-895 909 911 
916 954 963 967 974 977 986 988- 
989 993 995 997 1000 1005-1006 
1008 1014-1015 1020 1042-1043 
1070 1086-1087 1089-1090 1118- 
1119 1122 1144-1145 1148 1153 
1157 1159 1183 1195-1196 1227
1250 1257-1258 1262 1267 1280 
1285 1307 1312 1314 1317-1320 
1344-1345 1349-1350 1355 1362- 
1363 1403 1405 1415 1419 1425- 
1426 1429 1431 1442 1448 1463- 
1464 1469-1470 1489 1528 1536 
1539 1549-1550 1557-1562 1577 
1583 1598 1601 1611 1615 1622 
1644 1649 1666 1674 1706 1721 
1738 1746 1763-1765 1774 1776 
1779 



fetal liver 



Clontech 



FLV002 



676 998 1719 



fetal liver 



Clontech 



FLV004 



93 133 214 301 355 374 379 555
581 601 679 837 847 859 1123
1236 1270 1313 1324 1325 1327
1355 1367 1425-1426 1536 1690
1733 1760-1761



26 37-39 50-51 *B B4 86 89 98
113 138 131-132 139 155 172 186
194 198 201 206 211 230-231 256
261 276 282 286 302 325 359 361
375 379 383 398 412 413 419 430
436 448 452 462-463 473 477 503
519 529 561 569-570 590-591 597
607 623 626 635 647 660 672 715
725-726 730 733 761 775-777 788
826 837 860 874 913 915 921 935
970 980 986 988-990 992 1000-
1001 1007 1014 1027 1035-1036
1045 1060 1064 1070 1083 1097



fetal muscle 



Invitrogen 



FMS001 
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Tissue Origin 


RNA Source 








SEQ 


ID NOS: 








Library Name




















1099-1102 1116-1117 1121 1164
1173 1198 1208 1228 1240 1258
1266 1270 1277 1298 1317-1320
1324-1325 1329 1336-1337 1369
1383-1384 1399-1400 1403 1409
1433 1505 1514 1542 1551 1554
1557-1559 1562 1589 1599 1620
1632 1644 1650 1652 1671 1675
1712 1725-1726 1743-1744 1754
1766












fetal muscle


Invitrogen 


FMS002 


119 221 273 402 426-427 463 547
599 736 869 1000 1033 1083 1266
1431 1440-1441 1468 1545 1599
1673 1678-1679 1687-1688 1710
1712-1714 1723 1725 1731-1733
1743-1744 1760-1761 1767






Invitrogen


c orvw J- 


1 4-11 15-16 20-23 25 29 33 40
43 46 56-57 60-61 64-66 75 82 87
97-98 105 107-108 113 118-119
123 133 135-137 139 144 146 148
151-153 156 163 170 176 180 188-
189 197-198 200 202-203 210 218
222 231 246-247 261 263 265-270
277 285-286 290 293 299 301 307
311 321 325 328 330 333-335 339
341 345 351-352 355-356 358-359
362 368 370 372 376 379-382 384
388 394 404-405 408-409 411-412
419-420 424 426-427 436 441-442
445 448-449 454 462 465-466 472
476 490 493 504 506 509 515-517
519 526 531 537-540 547 549 560-
561 567 572-573 581 584 589 611-
612 615 623 630-631 635 647 649
651 657-658 660 662-665 667 669
672 676 678 681 688 701 704-705
709-710 713 717 720-721 725-726
728-729 732 748 750 753 759 764
766 770 775-777 780-781 tOO (DO
789 798 809 811 814 816-817 822
824-826 831 842 857 859 861 863-
864 881 894-895 908 910-911 916
918 922-923 928 932-933 935 937
946 948-949 953 960-961 966-967
970 975 977 986 990 992-993 999-
1000 1004 1007 1013 1018 1025
1027 1032 1035 1041-1043 1054
1057-1058 1060 1062-1064 1069
1072 1077 1090-1091 1097 1099-
1103 1108 1113 1119 1123 1128
1131 1134 1140 1148-1149 1152-
1153 1156 1163 1167 1178 1182
1189 1192 1195-1196 1198 1201-
1205 1208 1211-1212 1216 1219-
1220 1222 1225 1240 1243 1258
1266-1267 1274 1277 1280 1282-
1285 1299 1310 1317-1322 1324-
1325 1329-1330 1342 1344 1346
1349-1351 1354-1357 1365-1366
1369 1371 1373 1376 1378 1380
1383-1384 1387 1399-1400 1405
1410 1427 1429 1431 1433-1435
1439-1441 1448-1449 1454 1457
1468 1470 1472 1475 1480-1481
1487 1490-1491 1493 1498 1509
1512 1521 1525-1526 1529 1535-
1536 1547 1549 1557-1559 1588
1592 1595 1597-1598 1601 1603-
1604 1608 1611 1614 1618 1624-






WO 01/53312 PCT/US00/34263 



Tissue origin 



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS:



1644 

1665 

1702- 

1724 

1742 

1765 

1786 



U5T 
1646 
1668 
1703 
1727 
1747 
1772 



163T 
1654- 
1675 
1709- 
1731- 
1749 
1776- 



1636- 
1657 
1685 
1710 
1732 
1755 
1777 



164T 

1660- 

1687- 

1716 

1737- 

1760- 

1779- 



164T- 

1662 

1689 

1719 

1740 

1761 

1780 



fetal skin 



Invitrogen 



FSK002 



13 286 302 307 313 321 330 335 
339 341 354 370 372 385 400 402 
408 414 426-427 433 436 450 454
515 544 585 598 767 810 845 939 
1076 1109 1155 1317-1320 1326 
1333-1335 1343 1347 1350 1369- 
1371 1377-1378 1391 1397 1422 
1466 1647 1656 1678-1679 1687- 
1688 1693 1718 1721 1725 1731- 
1732 1739 1755 



fetal spleen 
umbilical cord' 



BioChain 
BioChain 



FSP001 



110 137 211 353 *89 927 1108 
1639 1771 



FUC001 



fetal brain 



4-8 10 12 14 17 33-36 44-46 57 
64 68-69 75 82 85 101 104 113- 
114 116 119 122-124 133 137 153- 
154 157 161 163 166-167 175 181- 
184 166 192 197-198 200-202 212- 
215 230 234 246-247 251 256 263 
267 271-272 280-281 284 295 301 
314 317 321 326 333-335 345 351 
356 368 371-373 379-380 386 390 
392 394 406 408-410 412 414 416 
420 424 427 430-436 438 444-446 
454 459 461 463 467 473 482-483 
486 488 490 495 504 509 524 526 
537-540 547 555 561 574-577 588-
591 593 606 615 620-621 632 637 
645-647 650 659-660 662-664 667- 
668 674-675 684 687 696 698 701 
703-705 709 711 714 719-720 725- 
727 732 749-750 762 765 771 775- 
777 780 789-791 793 796 802-803 
814-817 822 833 843 845 848 858 
861 864 875 879 888 894-895 897-
900 903 906-907 911-912 925 930- 
933 936 940 948 953 960 966 977 
984 990 992 998 1000-1001 1005- 
1007 1016 1023 1025 1037 1046- 
1047 1059 1061-1063 1073 1076-
1077 1089 1094-1097 1112-1113 
1115 1134 1144-1148 1151 1154 
1156 1163 1171 1197 1204-1205 
1208 1216 1218 1224 1234-1235 
1243-1244 1246 1279 1283 1286- 
1287 1298 1316 1320 1344 1346 
1350 1357 1359 1371 1373 1375 
1381 1398 1400 1403 1408 1414 
1424 1427-1428 1431 1433 1440- 
1442 1446 1454-1455 1479 1482
1484-1485 1489 1492-1493 1504- 
1505 1513 1525 1527 1536 1538 
1546 1565 1567 1571 1573 1575- 
1576 1578-1579 1591 1595 1600- 
1601 1608 1612 1615 1621 1624 
1626 1636-1637 1647-1648 1651 
1653 1656 1658 1661-1662 1672 
1675 1682 1684 1686-1688 1690 
1709-1710 1722 1727 1729 1735- 
1738 1740-1741 1760-1761 1768 



GIBCO 



HFB001 



4 9 11-13 17-18 22-23 25 37-39 
42-47 50-51 54-55 58 60-61 65-66 
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Tissue Origin 



RNA Source 



Hyseq 
Library Name



SEQ ID NOS: 



72 75 77 80 82 85 90-91 94 100- 
102 107 110 112-116 118-119 122- 
123 126 128 134 136-140 147-148 
153-155 157 161 165 169-172 175
181 186 188-189 197-198 204-206 
208 210 215 222-223 225-226 230 
235-238 240-241 247 253 256-258 
260-262 267-269 276 279-281 284 
286 289 298 300-302 307 310 318 
321-323 325 330-331 339 341 346-
349 352 354 356-359 362 364-365 
371-372 377 379-380 382 384 387 
390 400 408 414-416 419 424 431 
434-435 438 441-443 449 451 453-
455 457-463 470 472-473 475 477- 
478 482-483 486-488 490-491 493 
496 499-500 502-504 506-507 509- 
512 516 519-520 522 525-526 529- 
530 537-540 543-544 546-547 566- 
567 569-570 572-582 585 588 590- 
591 593 595 599 601 604 606-609 
611-612 614-620 622-624 630 632 
636 643 645-647 650-652 654 659 
661 665 667-668 670-672 676 678 
681 687 689 692-694 697 699 710 
714 717 721 727 729-732 734 736 
738 743-746 750-751 759 763 766 
770 772 775-777 784 789 791 796 
799 802-805 810-811 814 819-821 
824 826 830 834-837 839-850 854- 
856 858-860 862 864 669 871 876- 
877 879 883 885-887 890-891 893- 
Q95 898-901 905 908-910 912-916 
919 922-923 925 927 930-933 935- 
938 948 952-960 963-964 967 969- 
972 975 978-979 981 983 986-987
990 992 995 997 999-1002 1005- 
1009 1011-1013 1016 101S-1019 
1023 1026 1029-1031 1033-1035 
1038 1041 1047 1050 1053 1057 
1059 1064 1068 1070 1072-1073 
1078-1079 1081-1082 1086 1089 
1094 1097 1103 1107-1109 1113- 
1115 1121-1122 1127 1134-1135 
1138 1140 1143 1148-1151 1153 
1156-1157 1159 1167 1170 1175 
1193-1194 1200 1202 1207-1209 
1211 1216 1219-1220 1226-1227 
1229 1232-1234 1240-1241 1243 
1246 1249-1251 1253-1254 1258
1267-1268 1271 1276 1279 1282 
1285-1289 1293-1294 1305 1307- 
1308 1312 1316 1320 1327 1338- 
1339 1341-1344 1346 1349 1355- 
1357 1359 1365-1366 1369-1370 
1373-1375 1379 1386 1389 1394 
1398 1409 1413-1414 1416-1417 
1420-1421 1425-1427 1430 1433 
1437 1439 1442 1445-1452 1454- 
1457 1459 1463-1464 1468 1470 
1474 1477-1479 1489 1492 1494 
1497-1498 1501-1503 1507 1509 
1511-1513 1517 1520-1521 1524- 
1526 1531-1533 1535 1537-1538 
1547 1554 1556-1559 1564-1567 
1571 1584 1587 1589 1594 1599- 
1601 1611-1612 1614-1616 1619- 
1620 1625-1628 1630-1631 1634 
1637-1639 1640-1643 1645 1648- 
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Tissue Origin 



macrophage 



infant brain 



RNA Source 



Invitrogen 



Hyseq 
Library Name 



HMP001 



SEQ ID NOS: 



1649 1651 1653-1655 1657-1658 
1664-1665 1667 1669 1673 1678- 
1679 1683-1684 1686 1693 1701 
1704-1705 1709 1713-1714 1717- 
1720 1724 1727-1728 1731-1733 
1737-1738 1743-1744 1752 1754-
1755 1757 1760-1761 1765 1772
1779 1785 



5-8 110 204-205 503 634 678 859 
878 933 988-989 1379 1448 1504 



Columbia 
University 



IB200T 



10 12-13 15-18 22-23 25 29 34 

37-39 43 47 50-51 54-56 58 60-63 
65-66 68-69 72-74 80 82-83 86 
88-92 97 100 102-104 106-108 110 
112-113 115-116 118 123 128 130
134-136 138-139 143 147-149 151- 
152 154-155 163 165-167 169 172- 
175 181-184 186 193-196 198 201 
203-205 209-210 214-215 222 224- 
226 231-232 235-236 239 246-247 
252 257 260 268-269 272 276-277 
279-281 286 288 291-292 295 298 
300-301 304 307 310 313 321-323 
330-331 333-334 339 346-347 349 
352 356-357 362 371-372 377 379- 
380 383-384 392 397 401 406 408 
411 413-414 416 418-419 422 428 
430-431 434-435 438 443 449 453- 
454 461 464-466 469-470 472-473 
475-476 478 482-483 487 490 492 
494 497 503 507-508 510-513 516 
519-520 524-526 530-534 536-540 
547 550-551 561 563-564 566-567 
572-576 579 581-582 584-507 590- 
591 593 595-597 607-609 611-613 
616-617 620 622-624 627 631 637 
641 645-647 650-655 657-658 660- 
665 667-675 689 691 695 697 699 
703 707 713-715 717 721 728-731 
733-736 739 743 745 751 755 759 
763 769-770 772 778 780-781 785 
788-789 793-794 799 803 808 811
814 825-826 830 834-836 840-843 
845 848-850 854-855 860 862 864- 
865 870 872 875-876 878 886 888 
890-891 894-896 898 903-904 916- 
917 919 922-925 927-928 930-932 
934-936 938 941 945-946 948-950 
953-954 959-962 966-969 977 979 
981 986-990 992 997 999-1000
1004-1006 1014 1016 1018-1019 
1024-1025 1033 1036 1047 1051- 
1052 1054-1055 1057-1059 1063- 
1064 1068-1070 1073 1081-1082 
1085 1089 1108-1113 1118-1120 
1123-1124 1130 1132-1138 1140 
1149 1151 1153-1154 1163-1170 
1172 1174-1175 1183-1184 1188
1190 1193-1194 1196-1197 1199 
1204 1208-1209 1211 1218-1222 
1226-1227 1229 1231 1234 1241 
1247 1249 1251 1256 1258 1261- 
1262 1269 1274 1279 1281 1283 
1285 1287-1289 1294-1295 1305 
1307 1313-1314 1316-1320 1329 
1332 1341-1342 1345 1349 1356 
1362-1363 1365-1366 1368-1370 
1374 1381 1383-1384 1388 1400 
1403 1406-1407 1413 1417 1420
Tissue Origin | RNA Source



Hyseq 
Library Name 



SEQ ID NOS: 



1423 1429-1431 1435-1436 1439-
1441 1443 1447-1449 1451 1452
1454-1455 1457 1459 1463-1465
1468 1470-1471 1475 1479 1482-
1483 1485 1493-1494 1496 1490-
1499 1502-1503 1505-1507 1509
1522-1523 1525 1528 1531-1533
1542 1546-1547 1549-1550 1554-
1555 1563 1565-1567 1569 1575
1580 1583-1586 1588 1590 1592-
1593 1595 1598 1600-1601 1608-
1610 1612 1614-1616 1619 1621
1624 1626-1627 1630-1633 1637
1639-1640 1642 1644 1647 1652
1654-1655 1658-1659 1664-1665
1672-1673 1676-1681 1685-1688
1693-1695 1701-1702 1704 1708
1717-1720 1723-1724 1726-1728
1733 1735-1741 1743-1744 1752
1755-1758 1762 1765 1771 1774
1777 1778 1786



Columbia 
University 



IB2003 



infant brain



Columbia 
University 



IBM002 



Infant brain 



Columbia 
University 



IBS001 



17-18 20-23 29 34 43 60 68-69
78-80 88 100-101 107 110 112 118 
123 128 133 135-137 146 148 152 
159 166 169 174 194 198 203 215 
223 225-226 229 235-236 247 260 
276-281 286 290-292 295 300-301
310 322 324 331 334 339 346-347 
349-350 352 357 371 376-377 382 
384 403 408-409 414-415 453-455
472 476 478-479 490 503 507 516 
520 530 534 536-540 551 563 572- 
576 585 587 590-591 593 595-596 
601 606 612 616-617 620 622-624 
650 652-653 661 665 670-671 674- 
675 678 609 715 717 727-728 730 
734 759 775-777 780-781 785 796 
806-807 811 824 845-846 864 869 
875 882 889 894-895 898 904 917
919 921-923 932 935-936 946 950 
954 962 977 979 997 999-1000 
1005-1006 1009 1011 1017 1024 
1033 1037 1043 1055 1057 1109 
1114-1115 1120 1123 1127 1144- 
1145 1149 1151-1153 1160 1167 
1170 1174 1193-1194 1196 1199 
1202 1206 1209 1220-1221 1226 
1229 1240-1241 1251 1258 1284 
1288-1289 1305 1314 1327 1333 
1344 1347 1350 1356-1357 1365- 
1366 1378-1379 1388 1400 1403 
1421 1423 1431 1436 1440-1441 
1446-1447 1457 1459 1471 1499 
1503 1507 1509 1535 1546 1557- 
1559 1567 1572 1587 1595 1598 
1610-1612 1615 1631 1639 1644 
1647 1657-1658 1673 1678-1681 
1683-1684 1701-1702 1708-1709 
1713-1714 1719 1757 1760-1761 
1765 1771 1778

101 113 139 152 260 279 290-292
374 377 551 563 608-609 653 659 
814 954 1005-1006 1029-1030 1130 
1164 1209 1258 1294 1305 1320 
1327 1397 1431 1498 1507 1615 
1640 1694-1695 1763-1764 1767 
1779 



10 12 119 175 279-281 321 334

371 446 551 563 623 652 667 669
Tissue Origin



RNA Source 



Hyseq 
Library 



SEQ ID NOS: 



£71-472 819 949 966 1113 1130
1151 1188 1193-1194 1196 1229
1258 1265 1271 1287 1317-1319
1324-1325 1342 1423 1440-1441
1448 1471 1482 1525 1532 1546
1562 1569 1588 1591 1610 1618
1647 1649 1658



lung, 
fibroblast 



Stratagene



LFB001



5-9 17 20-21 25
153 157 197-198 
213 223 262 26C 
333 356 370 427 
472 493 498 503 
537-540 542-544 
599-600 607 615 
692-694 712 719 
794-796 810 837 
856 869 876 903 
964 975-976 984 
1024-1025 1033 
1070 1072 1082 
1136-1138 1140 
1233 1246 1279 
1320 1334-1335 
1446 1478 -482 
1552 1555 1567 
1620 1625 1632 
1655 1662 1680 
1690 1696 1702 
1760-1761 1778 



68-69 82 94 105 
203 207-208 212- 
233 302 321 326 
430 436 446 462 
516 519 527 535 
562-565 5S7 586 
630 647 662-664 
745 748 775-777 
843-847 849 854- 
934 953 955-956 
1000 1005-1007 
1039 1053 1064 
1112-1113 1134 
1195 1223 1232- 
1285 1295 1311 
1343 1427-1428 
1493 1504 1537 
1575 1582 1598 
1638 1645 1654- 
1681 1684 1686 
1711 1733 1741 
1785 



lung tumor 



Invitrogen 



LGT002 



194 196 
216-217 
246 251-252 



■214 
-241 244 



5-10 18 20-21 29 33-36 40 43 52 
54-55 61 65-66 68-70 73-75 80 85 
88-89 93-94 100 103 106-108 112- 
113 115-116 118-119 123-124 126 
130-132 135-137 139-141 143-144 
147-148 151-153 155-156 159 161 
164 169 171 179-180 185 190 193 
199 203-208 210 212- 
219 222 233 240 

255-256 261-262 266
272 276-277 279-281 284 286 288 
290 295 298 301-302 309-312 317 
321 329 332 341-342 344-345 348 
352 358-360 363 368 370-371 376 
380-381 384 389-390 398 400 409 
414 423 426-427 430 432-436 443- 
444 450-451 454 462 468 472-477 
480-483 487-488 490-491 493 496- 
498 500 503-506 509-512 515-516 
519 521-523 526 530 534 541 544 
547 554 557 564 566-567 572-576
585-586 588-589 595-596 601 607 
611-612 615 619 621 623 626 630 
632-633 644 647 649 651 655-656 
660 662-665 667 669 672 683-684 
696 700 706 710 713 716 718-719 
722-723 728 734-739 743 750 752
763 765-766 773-778 784-785 787- 
789 791 800 802-803 809-812 814 
824 826 828-829 832 838-839 841-
845 849-850 852-855 857-861 864 
866 874 878-880 882 887 890-891 
897-898 902 904 906-907 910 916 
918-920 922 924-925 927 930-932 
934-935 937 947 950 953 955-956
961 963 966-967 969 971 977-979 
981 984 986-987 990 992-993 995 
997 999-1001 1005-1007 1009 
1012-1013 1018 1020 1022-1024 
1026 1029-1030 1033 1038 1041
Tissue Origin


RNA Source 


Hyseq 
Library Name 






SEQ 


ID NOS: 










1045 1047-1050 1052 1054-1055
1059 1063-1064 1067-1071 1073-
1074 1070 1085 1087 1089 1095-
1097 1104 1106-1107 1109 1112
1116-1117 1119 1126 1134-1135
1139 1141-1142 1144-1145 1148
1152-1153 1156-1158 1167 1170
1172 1178 1195-1196 1198-1200
1202 1204 1208 1214 1216 1219
1222 1227 1234 1241 1247 1252
1257-1258 1265 1267-1270 1276
1278 1280-1281 1283 1285 1288-
1289 1295 1300 1305 1308 1312
1317-1321 1329 1338-1339 1341
1344-1346 1349-1351 1353-1355
1357 1365-1366 1369 1378-1379
1383-1385 1394 1397 1400 1402-
1403 1408 1417 1419 -1426
1431 1433-1436 1438 1444 1446-
1448 1454-1455 1460 1468
1470 1474 1480-1481 At* O J 1486-
1488 1490-1491 1494 1506
1508-1509 1511-1512 -1516
1519 1523-1524 1528-1529 1536-
1540 1546 1549-1550 1555 1560-
1561 1565 1567 1569 1575 1588
1591 1593-1594 1596-1598 1600-
1602 1608 1614-1616 1618 1620
1624-1625 1627-1632 1636 1639
1644-1645 1647-1649 1652-1653
1656-1662 1664 1666-1667 1670-
1671 1673-1675 1678-1679 1683
1685-1688 1690-1692 1696-1699
1705 1709 1716-1717 1722 1727
1730 1735 1739 1741 1743-1744
1748-1749 1753 1760-1762 1765
1767 1770-1771 1773 1775-1776
1778-1779 1786








lymphocytes 


ATCC 


LPC001


4 11-12 18 24-25 30-31 48 50-51
56-57 68-69 80 92 98 103 105 110
126 137 152-153 157 165 172 188-
189 197 203 210 217-218 222-223
225-226 229 231 247 251 256 264
272 280-281 284 300-301 321 325-
326 339 348 352 357 371 382 384
390 400 404 412 414 421 423 426-
427 430-431 445 447-448 451 454-
455 475 503 516 526-527 530 537-
540 549 556-560 563 574 577 589
602 613 615-617 621 623 628-630
636-637 647 649 657-659 690 697
717 723 755 764 775-777 780 786
789-790 793 800 802 822 838 849
866 869 876 881-883 892 898 906-
907 911 921-923 928 975 990 992
996 1001 1004-1007 1033 1050
1054 1078 1107 1135 1140-1141
1143 1148 1158 1163 1177 1199
1205 1216 1226 1231 1236 1241
1244 1250 1258 1260 1265 1269-
1271 1290-1293 1308 1312 1317
1319-1320 1339 1345-1346 1348
1350-1351 1357 1367 1369 1379
1381 1383-1384 1386-1387 1389
1394 1397 1405 1423 1425-1428
1431 1437 1446 1448 1461 1466
1470 1472 1474 1482 1492 1506
1528 1537 1546 1549 1591 1598
1600 1603-1604 1606 1627 1636
Tissue Origin


RNA Source 



Hyseq 






SEQ ID NOS:








Library Name




















1638 1647-1649 1651 1658-1659
1664 1676-1677 1680-1681 1687-
1688 1699 1711 1715-1716 1726
1728 1737 1740 1746 1748 1752
1756 1758 1777 1779






leukocyte 


GIBCO


LUC001


3-4 10-11 13 15-18 20-21 24-25
30-31 35-36 40 43-45 48 50-51
54-58 60-63 68-69 75 79-80 82-83
85 88-91 93-96 98 100 103-104
107-108 112 116 119 123 125-128
134-140 142 147-149 151 153 155
157 162-163 167 169-172 174 177-
179 186 190 192-199 203-207 210
212-215 217-219 222-223 229 235-
236 247 251 255-258 260 262 272
274-277 280-281 285-286 297-301
307-310 313-314 316-317 321 325-
330 333-334 340-342 348-349 352
354-358 370-371 380-385 387-388
400 405 408-410 412 414-416 421-
425 430-431 434-435 437 439 441-
442 445-451 453-454 456 459 461-
464 468-472 474-479 481 483-485
487-491 496 499-501 503-504 509-
513 516-519 522 526-527 529-531
534 536-540 542 547-549 553-559
566-567 571 574-577 579 582 584-
586 589 593 595-597 601-602 604
606-607 611-613 615-621 623 627-
629 633 636-637 642 644-650 655
659-660 662-665 667 669 674-675
678 682-684 692-696 698 700 706
708 710 716-720 725-726 729-736
738-739 743-746 749 751 753 756
759 765-766 768 770-773 780 784-
786 788-790 793 796 798 800 802-
803 810-811 814 817 819 826 828-
830 832 834-836 838 843 845-860
863-864 866-871 877-879 881-892
894-896 898 902 904-914 916 919-
925 927 930-932 935-936 941-942
945 948-949 953 955-956 958 960-
962 964 967 970-971 973 975 977
985-990 992-993 995-996 999-1002
1004-1009 1011 1014 1017-1019
1022-1023 1025 1027 1029-1031
1033-1036 1038 1041 1043 1047
1050 1053-1054 1058-1059 1061-
1062 1064 1068 1070 1072 1078
1085-1086 1089-1091 1093 1097
1106-1107 1110-1113 1115-1117
1122-1123 1125 1129 1132-1133
1135-1137 1140-1145 1152 1158
1163 1168 1170-1174 1176-1178
1180 1182-1183 1186 1195 1198-
1200 1202 1205-1206 1211 1216
1219-1221 1223-1227 1230-1236
1238-1242 1247 1252 1254 1256
1258 1261-1262 1264-1265 1269-
1270 1272-1275 1277 1280-1284
1287-1293 1299-1300 1306 1308
1312-1313 1317-1320 1322 1324-
1330 1333-1335 1339 1341 1343-
1347 1349 1353-1357 1359-1361
1365-1367 1369-1370 1373-1374
1377 1379-1381 1386-1387 1394
1400 1403 1409 1419 1423 1425-
1428 1430-1431 1433-1434 1437-
1438 1440-1442 1446-1448 1450
Tissue Origin



RNA Source 



Hyseq
Library Name 



SEQ ID NOS: 



1453 1457-1459 1463-1464 1468
1470-1471 1474 1477 1478 1482-
1488 1490-1493 1496-1501 1504
1506 1509 1512-1513 1516 1519
1521-1522 1524-1525 1527-1528
1531 1534 1538 1541 1545-1547
1549-1550 1553 1555-1556 1560
1565 1567 1575 1580 1589 1591
1594 1596 1598 1600-1602 1606-
1608 1611 1614 1620-1621 1624
1626-1629 1631-1632 1636 1638-
1639 1641 1644-1645 1648-1650
1653-1655 1658-1660 1662 1669-
1670 1675 1679 1684-1688 1690-
1692 1696 1700 1702 1707-1709
1711 1716 1717 1720 1723 1725-
1727 1733 1737-1738 1741 1743-
1744 1748-1749 1752 1755 1760-
1762 1765 1769 1771 1772 1781-
1784 1786



4 35-36 44-45 61 68-69 7$ 82 102
119 139 154 179 197 
324 372 404 430-431 
477 481 503 537-540 
581 589 608-609 621- 
632 647 662-664 669 
773 775-777 802 848 
879 905-907 915 949 
1002 1113 1119 1170 



leukocyte 



Clontech 



LUC003 



1236-1237 1241 1275 
1357 1359 1377 1506 
1553 1591 1600 1613 
1628 1670 1676-1677 
1699 1733 1738 1772 



244 280-281 
455 461 476- 
554 575-576 
622 624 630 
679 698 764 
851 856-857 
952 990 992 
1183 1216 
1346 1353 
1515 1534 
1614 1621 
1691-1692 



melanoma from 
cell line ATCC 
#CRL 1424 



Clontech 



MEL004 



150 



mammary gland 



Invitrogen 



MMG001 



25 35-36 43 80 104 126 128
163 166 188-189 197 210 215 220 
271 277 280-281 310 317 336-338 
345 351 372 380-381 383 387 412 
415-416 430 445 448 454 456 467 
481 490 499 503 526 528 546 548 
567 575-576 588 601 613 615 647 
660 665 734-735 737 759 778 787 
790 800 832 845 856 859 869 878 
883 887 905 914 932 934 958 976 
985 990 992 999-1000 1025 1031 
1038 1050 1055 1068 1074 1008 
1099-1102 1107 1136-1138 1149 
1156 1163 1172 1190 1195 1200 
1214-1215 1217 1226-1227 1235 
1238-1239 1244 1253 1278-1280
1293 1311 1320 1330 1334-1335 
1345 1355 1367 1386-1387 1394 
1403 1406 1414 1423 1437 1442 
1465 1521 1529 1536 1539 1541 
1547-1548 1582 1620 1626 1631 
1638 1647 1653 1660 1667 1669- 
1670 1680-1681 1696 1704 1715 
1724-1725 1731-1732 1750 1760-
1761 



5-8 10 12 14-18 20-21 24-25 29
33-39 42-43 52 55-58 60-64 68-69
71 73-74 79-80 82 89 98 100 103
106 108 112 123 128 133-137 144-
146 148 150-152 154 158-159 165-
166 170-172 174 176 178 181-185
188-190 194 198 201-206 210 217-
222 224 227 228 231 233-237 247
251 253-254 256 261-263 266-267
271 276-277 279-281 284-286 288
Tissue Origin



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



290 297 299 301 304 
320-321 323-325 327 
334 339 341 344-345
359-350 362-363 368 
303 3S0 390 393-395 
4C8 412 414-415 423 
441-444 448 451-455 
476 479 482 485-486 
495 498 503 506 509- 
519-520 522 527 529 
547 549 554 557 562
589-591 597 602 607 
629 632 634-640 644 
652 655 657-658 660 
672 674-676 679 682 
706-707 710 713 717 
732-734 736 738 743 
755 759 761 766 770 
789 794 803 806-807
822 827-829 837 842
■864 866 869-870 872 
893-900 904 906-907 
921-923 926 935-937 
953-954 957 960-961 
970 977-978 984-989 
1000-1001 1005-1006
1014 1016-1017 1023 
1032-1033 1036 1039 
1055 1057-1058 1063 
1077-1078 1085 1087 
1095-1102 1107-1108 
1121-1123 1131-1133 
1139-1142 1144-1145 
1153 1159 1167 1170 
1183-1185 1190-1192 
1207-1208 1212 1216 
1223 1225 1231 1234 
1247 1253-1254 1258 
1262 1270-1280 1283 
1298 1307 1314 1316 
1325 1330 1334-1335 
1349-1352 1354-1355 
1370 1377 1379 1381 
1389 1405 1414 1419 
1425-1426 1428-1429 
1437 1439 1448-1449 
1460-1464 1466 1471 
1487 1489-1491 1493
1512 1519 1526-1528 
1536 1539 1542 1547 
1554 1561-1562 1564 
1576-1579 1581-1582 
1592 1594 1596-1597 
1607-1608 1610 1612 
1621-1622 1625-1626 
1636 1641 1643-1644 
1652 1654-1655 1657 
1662 1664-1666 1669 
1674 1676-1677 1680 
1692 1701 1706 1713 
1720 1723-1728 1730 
1740 1742-1744 1746 
1751 1753 1760-1762 
1771 1774 1776-1777 
1784 1786 



706- 
817- 
863- 



309-312 318 
329 331-332 
348 350 356 
371 376 379- 
397-398 405 
430 434-437 
462-464 474 
468 490 494- 
512 516-517 
534 537-541 
572-574 587 
618 623 628- 
647-648 650- 
665 667 669- 
688 695-696 
720 722-730 
747-748 750 
780 704 
809 814 
854-858 
878 881 889 
911 916 919 
946 948-949 
963 965-966 
993-997 
1008 1013- 
1025 1027 
1043 1045 
1068-1075 
1089-1091 
1112-1119 
1136-1137 
1148-1149 
1172-1173 
1196-1199 
-1218 1222- 
1240-1241 
-1259 1261- 

1285-1286 
-1320 1323- 
1342-1345 
1359 1369- 
1383-1384 
1421-1423 
1431 1434- 
1454 1457 
1480-1483 
1505 1507 
1532 1534 
1549-1550 
1567 1572 
1587-1588 
1601-1602 
-1616 1618 
1631 1635- 
1647 1650 
-1658 1660 

1671 1673- 
-1685 1689- 
1715 1719- 
-1732 1738 
-1747 1749 
1765-1768 
1779 1783- 



induced neuron
cells 



Stratagene



NTD001 



29 35-36 80 116 123 156 163 181
214 230 280-281 284-285 307 321
330 340 358 371 375 377 380 382
422 424 492 497 532 533 542 546
Tissue Origin



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



549 566 586 595 612 645-647 654
734 775-778 780 792 799 821 826
856 858 875 936 953 985 990 992
1041-1043 1055 1072 1104 1193-
1194 12QG 1223 1246 1253 1274
1288-1289 1291 1294 1311 1320
1349 1359 1412 1423 1485 1620
1623 1645 1684 1705 1715 1751



retinoic acid

induced 
neuronal cells 



Stratagene



NTR001



neuronal cells 



Stratagene



NTU001 



5-8 78 268-269 277 383 431 506
623 677 731 999-1000 1199 1425- 
1426 1547 



29 65-66 80 82 110 119 146 152 
166 174 181-185 198 227-228 253 
284 309 325 332 334 336-338 375 
391 393 406 414-416 454 465-466 
470 488 503 506 510-512 519 537- 
540 572-574 597 602 607 623 647 
661 700 702 716 743 771 792 858 
904 948 954 977 1000 1005-1006 
1025 1064 1068 1122 1148 1185 
1219 1226 1234 1246 1271 1283
1295-1296 1311 1317-1320 1329- 
1330 1350 1355 1365-1366 1378 
1383-1384 1400 1412 1445 1505 
1539 1547 1578 1647 1656 1683 
1690 1738 1749 1783-1784 



pituitary 
gland 



Clontech 



PIT004



311 314 379 408 419 430 454 1055 
1095-1096 1272-1273 1312 1320 
1378 1652 1671 1720 1725 1736 
1741 1755 



placenta 



Clontech 



PLA003 



prostate 



Clontech 



PRT001 



5-8 124 208 277 370 843 906-907 
1200 1317-1319 1359 1609 1621 

1737 

9 46 57 71 107 147 171 177 197 



rectum 



Invitrogen 



REC001



201 229 231 242-243 274 280-281 
307 310 317 330 358 373 382-383 
400 430 434-436 461-462 469 477 
489 497 500 505-506 513 521 526 
531-533 547 618 649 657-658 662- 
664 710 729 767 771 789 820 861 
871 874 890-891 905 938 945 963- 
964 988-989 1002 1025 1033 1045 
1061 1095-1096 1112 1125 1142 
1196 1198 1202 1232-1233 1241 
1258 1272-1273 1287 1295 1313 
1333 1341 1344 1349 1360 1362- 
1363 1367 1437 1442 1447 1475 
1478-1479 1482 1489 1513 1517 
1527 1531 1536 1598-1599 1628 
1636 1657 1680-1681 1687-1688 
1717 1738 1743-1744 
17-18 29 33 62-63 71 73-74 83 86 
113 126 146 153 158 167-169 195 
200 206 261 309 312 341 344 368 
373 388 395 408 414 420 430 441- 
442 446 448 464 468 483 517 537- 
540 547 567 585 589 602 623 628-
629 632 645-647 651 657-658 669 
717-719 721 725-726 730 748 750 
756 762-763 766 770 774 790 819 
825 843 849 851 881 903 909 948-
949 960 986 996 1020 1023 1033- 
1034 1064 1067 1070 1075 1086 
1108-1109 1113 1130 1139 1153 
1139 1172 1178 1185 1187-1189 
1205 1220 1225 1240 1244 1271 
1317-1320 1323 1334-1335 1350- 
1351 1355 1369 1373 1375 1425-
Tissue Origin


RNA Source 


Hyseq


SEQ ID NOS: 






Library Name 










1426 1436 1439 1469 1474 1477 








1482 1546 1587-1588 1592 1596








1610 1622 1627 1644 1658 1662 








1665-1666 1669 1675-1677 1749 








1786 


salivary gland 


Clontech 


SAL001 


10 55 97 103 110 140 149 152 158 








198 217-218 242-243 256 301 308








312 321 333 351 354 360 410 437 








448 473 487 494 496 501 535 555 








569-570 572-573 590-591 624 636 








651 759 762 764 768 771 788 800 








809 826 848 865 879 906-907 925 








933 963 1016 1020 1025 1040 1046 








1055 1066 1103 1150 1172 1181








1234 1281-1282 1288-1289 1298 








1315 1320 1333 1336-1337 1346 








1359 1373 1379 1424 1447 1449 








1474 1482 1492 1494 1498 1511 








1523-1524 1537 1554 1596 1626- 








1627 1636 1652-1655 1658 1665 








1671-1672 1691-1692


salivary gland 


Clontech 


SALs03


158 326 1423 1463-1464 


skin 


ATCC 


SFB001


1320 1400 


fibroblast 








skin 


ATCC


SFB002 


262 736 1025 1253 


fibroblast








skin




SFB003


709 1119 1350 1631 1653


fibroblast














?5 14? 145-147 151 155 19R 


inhpnt 4 np 

JLUl>^£> *- lilt: 






944 260 271 2RO-2R1 2RS ?Rfl ? QH 









3D1-302 30R 112 334 340 371 39fl 








408 412 414 416 423 425-427 430








434-435 445 452 454 d7B 5(13 515 








519 521 523 543 547 549 555 559 








563 S69-570 585 592 604 fill 525 








628-629 632 650 659 681 710 714 








718 750 764 780 795 829 842 857








859 866 887 892 894-895 901 904 








905-907 912 919 935 997-998 1000








1007-1008 1026-1028 1044 1055








1089 1097 1115-1117 1131 1148









1159 1199 1219 1234 1247 1254 









1279 1316 1320 1326 1341 1343 








1349 1351 1374 1387 1398 1400 








1403 1407 1423 1428 1468 1498 








1501 1521 1550 1556 1585 1597 








1636 1638-1639 1645 1653 1656 








1662 1671 1675 1684 1691-1692 








1704 1711 1717 1719 1722 1725- 








1726 1729 1733-1734 1743-1744 








1762 1767 1780 1785 


skeletal 


Clontech 


SKM001 


18 20-21 82 84 101 118 134 148 


muscle 






151 153 166 225-226 258 274 277 








289 329 361 412 414 424 440 452 








459 470 486 503-504 537-540 647








660 673-675 715 773 780 786 830








905 922 950 963 982 990 992 1020








1047 1063 1115-1117 1121 1134 








1228 1268 1284 1298 1321 1329 








1336-1337 1343 1409 1413-1414 








1509 1599 1624 1644 1653 1712 


skeletal 


Clontech 


SKM002 


1*8 Z.6B3 1712 


muscle 








skeletal 


Clontech 


SKMs03 


235-236 1409 


muscle 








skeletal 


Clontech 


SKMs04


235-236


muscle 








spinal cord 


Clontech 


SPC001


4 9 11 17 30-31 35-36 43 46 60
Tissue Origin



adult spleen 



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



82 85 92 94 108 110 116 139 157
167 198 204-205 210 215 229 256
259 277 280-281 300-302 304 315
317 372 379 387 392 419 426-427
430 433 448 467 473 487 489 506
509 513 519 524 526 537-540 543
547 549 551 559 567 569-570 5S3
607 616-617 623 625 637 649-650
652 657-658 670-671 673 679 681-
682 709 711 715 719 728-729 734
749-750 753 775-777 781 789 791
809 820 832 834-836 847-849 854-
855 858 861 864 871-872 875 884
898 906-908 917 919 924 934 942
944 970 985 990 992-993 998 1013
1039 1053 1059 1065 1072 1075
1077 1082 1085 1097 1103 1109
1116-1117 1128 1134 1151 1170
1174 1192-1194 1215 1225 1241
1243 1283 1294 1307 1312 1320
1323 1327 1330 1350 1353-1354
1356 1359 1368 1375 1400 1406-
1407 1423 1429 1437 1443 1448
1454 1470 1482 1492 1501 1508
1511 1529 1538 1548-1549 1565
1571 1578 1598 1600 1614 1625
1627 1630 1639 1646 1651-1652
1670 1686 1696 1740 1751 1755
1771



Clontech



SPLc01



stomach 



Clontech



STO001 



117 312 326 348 424 426-427 431 
845 666 1320 1330 1333 1344 
1355-1357 1371 1387 1397 1446 
1538 1579 1669 1686 1739 1767 



thalamus 



Clontech



thymus 



Clontech 



10 15-16 61 68-69 100 117 149 
197 201 227-228 231 249 273 280- 
281 287 291-292 302 312 358 362 
426-427 430 446 462 475 479 535 
597 620 630 651 662-664 722 739
780 782 785 846 919 960 964 966- 
967 976 1008 1012 1032 1042 1063 
1071 1135 1170 1208 1234-1235 
1259 1277 1280-1281 1322 1349 
1359 1369 1449 1468 1474 1478 
1487 1493 1498 1557-1559 1622 
1634 1651 1653 1729 



THA002 



9 11 25 85 87 112 137 146 180 
190 198 206 210 212-213 235-236 
239 261 268-269 279 290 301 325 
333-334 341 351 356 364-365 379 
388 393 396 419-420 441-442 458 
477 483 508 525 531 549 567 606 
608-609 647 681 715 725-727 736 
774 782 784 794 827 883 890-891
899-900 961 997 999-1001 1004 
1034 1055 1097 1129 1144-1145 
1150-1151 1157 1172-1173 1177 
1193-1194 1208 1220 1249 1280 
1305 1345 1355 1369 1434-1435 
1440-1441 1454 1496 1546 1549 
1562 1572 1578 1590 1594 1613- 
1614 1640 1651-1652 1671 1687-
1688 1703 1743-1744 1746-1747
1753 



THM001 



44-45 54 57-58 62-64 79 104 123 
126 134 153 193 212-213 218 242- 
243 258 274 277 279 297 301 307 
327 330 333 342 351 358 371 410 
430 445 465-466 468 471 483 487 
493 503 506 509 517 526 535 537-
Tissue Origin



RNA Source



Hyseq 
Library Name 



SEQ ID NOS:



540 546 548 554 567 584 586 590- 
591 604 612 621 638-640 645-647 
649 656 660 665 670 698 710 720
728 735 739 746 759 762 766-767 
775-777 780 784-785 800 802 809
824 826 828 845 851 858-859 864 
866 870-871 878 884 887 892 899- 
900 927 930-931 967 983 986 990 
992 999 1014 1029-1030 1033 1059 
1066 1073 1103 1107 1113 1116- 
1117 1119 1140-1142 1158 1163 
1172 1177 1195 1206 1209 1213 
1216 1218-1219 1221-1222 1227 
1271 1277 1282 1320 1329 1349 
1367 1369 1383-1384 1417 1419 
1423 1425-1427 1448 1477 1488 
1493 1536 1554 1620 1644 1646
1649 1654-1655 1661-1662 1669-
1670 1674 1676-1677 1685-1688 
1707 1711 1731-1732 1737 



50-51 54-55 60 75 83 87 89 93 
98-100 102 105 112 117 135-137 
141 143 146 157 167 169 192 196 
211 217-219 222 224 229 233 235-
236 240-241 244 251-252 256 261- 
262 268-269 286 288 290 295 297 
301-302 309-310 315-317 321 324 
327 334 342 350 352-353 360 370- 
373 382 384 400 403 410 414-416 
424 430-431 436 445 454-456 461 
464-467 470 472 474-476 483 488
497 500 504 506 513 516 519-520 
524 526 530-531 534 537-540 549 
554-555 565-566 569-570 572-573 
575-577 586-587 595 603-604 606 
612 630-632 634 636 647 650 657- 
660 666-667 669 673-675 678 698
700 703 708 720 725-726 731 738- 
739 743-744 750-753 757 759 763- 
765 767 772-779 787 789-790 798 
800 810 823 829 834-836 841 848 
854-856 859 861 864 870-871 881 
890-891 898 908-909 913 928 933
941 949 958 961 963 967 969 975 
981 986 988-990 992 999 1007- 
1008 1014 1016 1039 1041 1073- 
1074 1079 1089 1097 1109 1114- 
1117 1122 1131 1140-1141 1144- 
1145 1163 1172 1175-1177 1186 
1196 1198 1206 1211 1216 1220 
1223 1227 1234-1243 1261-1262 
1267 1271 1280-1281 1284 1290 
1308 1317-1320 1322 1324-1325 
1327 1330 1334-1335 1339 1346 
1350-1351 1355 1357 1360 1370 
1374 1377-1379 1386 1389-1390 
1392 1397 1400 1402 1406-1407 
1417 1423 1425-1427 1440-1441 
1466 1474 1477 1483 1493 1498 
1504 1506 1525 1536 1545 1549 
1566 1594 1596-1600 1606 1611 
1614 1621 1623 1625 1632 1639 
1641 1644 1647 1649 1653-1656 
1658 1662-1663 1671 1673 1678- 
1681 1686-1688 1693 1705 1707 
1711 1717-1718 1726-1727 1731- 
1733 1737-1738 1743-1745 1758- 
1761 1771-1772 1779 1786 



thymus 



Clontech



THMc02



5-9 15-21 25 33 35-36 43-45 48
Tissue Origin



thyroid gland 



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



Clontech 



THR001



trachea 



Clontech



TRC001



q 9-10 20-21 37-39 48 50-51 54-
57 60-61 65-66 71 83 94-96 98-
100 102 104 110 112 115-117 119 
123 127 133 136-137 140 149 152- 
153 155-158 163-164 168-169 171
186 190-192 197 201-203 219-220 
229 233-237 246-247 253 256 258
262 265-266 268-269 277 280-281 
284-286 288-289 298-299 302 309- 
311 317 321 326 332 335 341-342 
344 348 350 354 358-359 363 368 
371-373 382-383 385 394 398 400- 
401 411 414-415 421 424 430-431 
433-436 443-446 450-452 454-455 
458 472-474 476-478 482 484-485 
487-488 490-494 496-497 500-501 
503-504 506 509-513 516-517 519 
524 526-527 529 535-540 547 549 
562 564 569-570 575-576 588 594-
595 601-602 604 606 610 612 615- 
617 619-623 628-630 634-635 642 
647 649-651 660 662-665 668 670 
681 690-694 696 698 700 709 721 
727-729 732 734 738 740-741 743 
745 750 759 761 763 765 770 773 
780 785 795-796 798 802 804 823-
824 826 828 833 838 841-845 847 
849 857-860 867 874-875 878 880-
881 887-888 890-892 894-895 898 
908 910-911 913-914 922-923 926- 
927 929 932-934 937 939 941-942 
948 953 957 961 963-964 966 978- 
979 981-982 987 990 992 1001 
1004-1006 1010 1014 1020 1024 
1033 1038-1039 1044 1047 1050 
1052-1054 1055 1058 1068 1070- 
1071 1077-1079 1088 1094-1097 
1105-1106 1112-1113 1116-1117 
1124 1126 1128-1129 1131 1134 
1136-1137 1142-1143 1146-1147 
1149-1150 1156 1161-1164 1167
1170-1173 1177-1181 1190 1192 
1197 1200 1204 1208-1209 1214 
1217 1219 1222 1230 1232-1233 
1235 1241 1245 1247 1254 1257- 
1258 1260 1262 1271-1273 1283 
1286-1289 1299 1306 1314 1320 
1330-1332 1334-1335 1342 1345 
1349 1365-1367 1370-1372 1374 
1381 1394 1407 1419 1428 1436-
1437 1440-1441 1443 1446-1449 
1454 1459 1461-1462 1468 1470- 
1471 1475 1477 1479 1482 1491 
1497-1498 1504-1505 1507 1513 
1522 1524-1526 1528 1531 1534 
1536-1537 1548 1550 1553 1555- 
1559 1562 1567 1578 1590-1591
1597 1599-1601 1612 1614 1616 
1619-1620 1622 1624-1626 1628 
1631-1632 1634 1636 1639 1644- 
1645 1648 1651 1653-1656 1658 
1660 1662-1663 1667 1669 1671 
1675 1678-1681 1683-1686 1689 
1691-1692 1703 1709-1711 1717 
1724-1726 1729 1734 1737-1738 
1740 1743-1744 1749 1753 1759- 
1761 1770 1777 1786
9 29-31 46 48 87 104 107 110 135 
158 222 262 266 286 301 318 331
Tissue Origin


RNA Source 


Hyseq 
Library Name 






SEQ 


ID NOS: 










352 372 377 384 414 424 445-446
454 472 474 491 496 560 579 588
593 597 607 612 626 681 702 719
810 859 866 878 894-895 912 916
922 932 935 1046 1075 1080 1099-
1102 1113 1208 1215 1232-1233
1237 1281 1312 1385 1387 1405
1414 1424 1430 1437 1447 1505
1569 1579 1586 1600 1641 1653
1667 1671 1676-1677 1683 1691-
1692 1711 1717 1726 1772




uterus Clontech UTR001
17 19 25 41 46 57-58 61 89 104
108 139 152 174 198 200-201 206 263-265
274 290 387 408 420 438 446 448 452 473
491 493 499 503 506 513 519 522 526 530
542-543 560 601 610 632 659 665 720 751
773 780 833 845 857 872 877 912 929 934
937 996 1009-1011 1018 1050 1075 1107
1124 1170 1219 1258 1279 1287 1310 1320
1323 1343-1344 1375 1437 1451-1452 1478
1481 1498 1519 1521 1536 1552 1579 1597
1602 1606 1620 1626-1627 1649 1652 1661
1670 1719 1722-1723









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TABLE 2 



SEQ
ID
NO:


ACCESSION
NUMBER


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


%
IDENTITY


1 


" VA 1 TiC 

Y41736 


Homo 
sapiens 


Human PRO1114 protein
sequence.


1398 


100 


c 


iboDbo 


Homo 
sapiens 


Membrane-bound protein
PRO943.


2389 


99 


-g ■ — 




Homo sapiens 


IL-1 receptor-associated
kinase-M; IRAK-M


3043 


100 




Ul /DUD 


Mus musculus


Zn-15 transcription factor 


6351 


77 


5 


X02761 


Homo sapiens 


fibronectin precursor


10535


98 


6 


X02761 


Homo sapiens 


fibronectin precursor


8990


89


8


X02761 


Homo sapiens 


fibronectin precursor


12564 


99 


9 


AJ011679 


Homo sapiens 


Rab6 GTPase activating 
protein, GAPCenA 


5251 


99 


10 


W88501


Homo sapiens 


Human stomach carcinoma clone 
HP104 IS -encoded protein. 


2381 


100 


11 


AF117754 


Homo sapiens 


thyroid hormone receptor- 
associated protein complex 
component TRAP240 


11336 


98 


12 


Z97630 


Homo sapiens 


dJ466N1.4 (novel protein
similar to ANK3 (ankyrin 3,
node of Ranvier (ankyrin
G)))


896 


100 


13 


Y58620 


Homo sapiens 


Protein regulating gene 
expression PRGE-13. 


1894 


98 


14 


AF213457 


Homo 
sapiens 


triggering receptor expressed 
on myeloid cells 2 


1238 


100 


16 


AF233453 


Homo sapiens 


RACK-like protein PRKCBP1


3124 


99 


17 


AF201303 


Homo sapiens 


dhfr oribeta-binding protein
RIP60 


3130 


98 


18 


AF064205 


Homo sapiens 


dynactin 1 p150 isoform


6377 


100 


19 


U00059 


Saccharomyces
cerevisiae


Yhr121wp


174 


26 


20 


AB032903 


Homo sapiens 


guanosine monophosphate
reductase isolog 


1801 


99 


21 


AB032903 


Homo sapiens 


guanosine monophosphate 
reductase isolog 


1485 


99 


22 


AF140507 


Homo sapiens 


Ca2+/calmodulin-dependent
protein kinase kinase beta 


3083 


99 


23 


AF140507 


Homo sapiens 


Ca2+/calmodulin-dependent 
protein kinase kinase beta 


2300 


99 


24 


AJ289131 


Homo sapiens 


chondroitin 4-O-
sulfotransferase


2211


99


25


U33460 


Homo 
sapiens 


DNA-directed RNA polymerase
I, largest subunit 


8777 


98 


26 


Y44488 


Homo sapiens 


ACRP30R2 variant protein. 


1387 


100 


27 


U43701


Homo sapiens


ribosomal protein L23a


791 


100 


28 


U02032 


Homo sapiens 


ribosomal protein L23a 


7? 7 


97 


29 


Y41324 


Homo sapiens 


Human secreted protein 
encoded by gene 17 clone 
HNFIY77.


1083 


99 




W71749 


Homo sapiens 


Human ubiquitin conjugation
system protein 2. 


715 


90 


31 


W71749


Homo sapiens 


Human ubiquitin conjugation 
system protein 2. 


*31 


82 


32 


AF231917 


Homo sapiens 


long-chain 2-hydroxy acid 
oxidase HAOX2


1811 


100 


33 


Z29481 


Homo sapiens 


3-hydroxyanthranilic acid 
dioxygenase 


1507 


99 


34 


AB001451 


Homo sapiens 


Sck 


2869 


100 


35 






precursor polypeptide (AA -34 
to 287) 


1667 


99 


36 


Y00644 


Homo sapiens 


precursor polypeptide (AA -34 
to 287) 


no4 ' ■ 


98 


37


Y78795 


Homo sapiens 


Human antizyme-2 (AZ-2) amino
acid sequence.


3586 


78 


38 


Y78795 


Homo sapiens 


Human antizyme-2 (AZ-2) amino
acid sequence.


4726


99


39 


Y78795 


Homo sapiens 


Human antizyme-2 (AZ-2) amino
acid sequence.


3556 


77 


40 


U93121 


Homo sapiens 


M-phase phosphoprotein-1


3747 


100 


41 


Y42750


Homo sapiens


Human calcium binding protein
1 (CaBP-1).


795 


100 


42 


AF282626


Homo sapiens 


latexin 


1189 


100 


43 


G02150 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6231. 


384 


94 


44 


U19617 


Mus musculus


Elf-1


2724 


88 


45 


U19617 


Mus musculus


Elf-1


2062


86


46 


AF100758 


Homo sapiens 


osteoinductive factor OIF 


1538 


100 


47 


Y87591 


Homo sapiens 


Human SPROUTY-1 protein, SEQ 
ID NO:24. 


1737 


99 


49 


X04145 


Homo sapiens 


T3 gamma precursor (aa -22 to 
160) 


942 


99 


51


X63547 


Homo sapiens 


oncogene 


5045 


99 


52 


M94043 


Rattus 
norvegicus 


rab-related GTP-binding
protein 


1089 


96 


53 


L31783


Mus musculus 


uridine kinase 


917


71 


54 


X83973 


Homo sapiens 


transcription factor 


4486 


98 


55 


AF224741


Homo sapiens 


chloride channel protein 7 


4128 


99 




W74805 


Homo sapiens 


Human secreted protein 
encoded by gene 77 clone 
HOEAS24.


1491 


100 


57 


Z50907 


Homo sapiens 


Human TBC-1 cDNA from second 
transcript.


4824 


100 


58 


D79994 


Homo sapiens 


similar to ankyrin of 
Chromatium vinosum.


6089


99 


59 

To "■' ] 


D79994 


Homo sapiens 


similar to ankyrin of
Chromatium vinosum. 


4014 


91 




Y59738 


Homo sapiens 


Human normal ovarian tissue 
derived protein 15. 


601 


100 


61 


AB031069 


Homo sapiens 


protein containing CXXC 
domain 1 


1390 


100


62


Y66660


Homo
sapiens


Membrane-bound protein
PRO783.


2492 


99 


63 


Y66660 


Homo 
sapiens 


Membrane-bound protein
PRO783.


1709 


99 


64 


S70011 


Rattus sp. 


tricarboxylate carrier 


895 


55 


65 


AF13^51B 


Rattus 
norvegicus 


A-kinase anchor protein 


178 


24 


66 


to2$$66 


Homo sapiens 


Homo sapiens DH1308_1 clone 
secreted protein. 


157 


30 


67 


AJ245738 


Homo sapiens 


claudin-15 


1206 


100 


68


AF099138


Rattus 
norvegicus 


GLUT 4 vesicle protein 


4183


87 


69 


AF099138 


Rattus 
norvegicus 


GLUT 4 vesicle protein 


4906 


86 


70 


Z82059


Caenorhabditis
elegans


Similarity to Drosophila ring 
canal protein comes from 
this gene 


1285 


44 


71 


AF224278 


Homo sapiens 


PMEPA1 protein


1282 


100 


72 


AF126426 


Homo sapiens 


neurotrimin 


1809 


100 


73 


Y41652 


Homo 
sapiens 


Human MEK2 protein sequence. 


2065 


99 


74


Y41652 


Homo 
sapiens 


Human MEK2 protein sequence.


1207 


100 


75 


AF188622 


Mus musculus 


selectively expressed in 
embryonic epithelia protein- 1 


1465 


74 




AE000406


Escherichia
coli


putative DNA topoisomerase


950 


100 


77 


X99302 


Homo sapiens 


Pop1


655 


100 


78 


AL136538 


Schizosaccharomyces
pombe


similarity to S. cerevisiae
kti12 protein


210 


31 


79 


AF129756 


Homo sapiens 


G4 


1554 


99 


80 


AL096768


Homo sapiens


dJ858B16.2
(phosphatidylserine
decarboxylase (PSSC, EC
4.1.1.65))


2033 


100 


81 


AL096768 


Homo sapiens 


dJ858B16.2
(phosphatidylserine
decarboxylase (PSSC, EC
4.1.1.65))


1220 


96 


82 


X57351


Homo sapiens 


1-8D 


677 


98 


83 


AC005594 






2700 


98 


84 


X73113 


Homo sapiens 


fast MyBP-C 


5959 


99 


85 


AF097330 


Homo sapiens 


H1 chloride channel; p64H1;
CLIC4


1305 


99 


86 


AB018423


Mus musculus


SH2 domain-containing protein


1360 


78 


87 


AF272151 




adaptor protein CIKS 


3084 


99 


88


AF196329


Homo sapiens


triggering receptor expressed 
on monocytes 1 


1214 


100 


89 


AB016879 


Arabidopsis
thaliana


contains similarity to pre-
mRNA splicing
factor~gene_id:MRB17.2


634 


36 


90 


AJ133721 


Mus musculus


homeodomain protein 


£S4 


57 


91 


AJ242864 


Mus musculus 


phtf protein 


619 


61 


92 


ACT 971 " 


unidentified 


MCSP 


11676 


99 


93 


Y99365


Homo sapiens


Human PRO1250 (UNQ633) amino
acid sequence SEQ ID NO: 86.


3890 


100 


94 


ID 1 &$\ 


Homo sapiens


Human signal peptide 
containing protein HSPP-8 
SEQ ID NO: 8.


1031 


100 


95 


t\C f fH±. 


Rattus
norvegicus


protein kinase WNK1


2428 


95 


96 




Rattus
norvegicus


protein kinase WNK1


1961


94


97 


Y92513 


Homo sapiens 


Human OXRE-10.


1626 


100 


98 


AL.0 9 1 


Homo sapiens 


CICK0721Q.3 (Kinesin related 
protein) 


3423 


100 


99 


AC005733


Homo sapiens 


R33083 1 


1974


99 


100 




Homo sapiens 


Human GEF containing NEK- like 


4092 


99 








kinase substrate sGNK. 




101 


ftiJil ujvX 


Homo sapiens 


dJH9iNi6.l (A novel protein
(translation of the cDNA
DKFZp566A0946, Em:AL050069))


1509 


100 


102 


AJ006267 


Homo sapiens 


ClpX-like protein 


3233 


100 


103 


AF100753 


Homo sapiens 


ancient ubiquitous 46 kDa 
protein AUP1


2042 


96 


104 


AB015982


Homo sapiens


serine/threonine kinase


4718 


100 


105


AF151074




HSPC240


831 


64 


106

107


M35522


Canis
familiaris


GTP-binding protein (rab7)


354 


50 




"R99800 


Homo sapiens 


NT I I - l nerve protein/ 
facilitates regeneration of 


2337 


93 


108 


AF125533 


Homo sapiens 


NADH-cytochrome b5 reductase
isoform


1290 


93 


109 
110 


AC005614 
AF064729 


Homo sapiens


F23269 2 

RAN binding protein 16 


3369 


99 


111 
112 


X52425 


Homo sapiens 


interleukin 4 receptor 


3285 
4496 


100 
100 




Y41686 


sapiens 


Human PR0274 protein 
sequence . 


2285 


100 


113 

"TH 


W1SS06 1 


Homo sapiens 


Mitogen activating protein
kinase ERK1. 


1991 


100




Y71071 


Homo sapiens 


Human membrane transport 
protein, MTRP-16. 


1190 


99 


115 

116


AL049548 


Homo sapiens 


dJ398G3.1 (ortholog of rat 
CPG2 ) 


3497 


99 


117 


AF189817 
W30891 


Mus musculus 
Homo 


evectin-2 

Human cytostatin lit protein. | 


1124 
715 


90 
99 
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sapiens 








118 


*» L X W D dk O 




trtUJ JLUJo 


1469 


100 


119 


Y08915 


Homo sapiens


alpha 4 protein 


1748 


100 


120


AF098070


Drosophila
melanogaster


Lis1 homolog


192 


39 


121 


AF052432 


Homo sapiens 


katanin p80 subunit 


181 


37 


122 




Homo sapiens 


PSEQ-1 protein encoded by 
NSEQ gene associated with 
matrix remodelling. 


2637 


98 


123 


AF083246 


Homo sapiens 


HSPC025


2132 


100 


124 


/UHo 


Homo sapiens 


Human viral receptor protein 
(ACVRP).


833 


99 


125


MbilUy 


Leishmania
major


glycoprotein 96-92 


172 


27 


126 


U75467 


Drosophila 
melanogaster


Atu ' 


93* 


36 


127 


u v u ^ ^ VJ 


Caenorhabditis
elegans


similarity to Human ADP/ATP 
carrier protein 


438 


43 


128 


AF095927 


Rattus 
norvegicus 


protein phosphatase 2C 


1927 


94 


129 


W92958 


Homo sapiens 


Human Zsig44 protein.


463 


100 


130 


AF115391


Lactobacillus
sakei


ribokinase RbsK


508


37


131


X93498 


Homo sapiens 


21-Glutamic Acid-Rich Protein 


1250 


100 


132 


X93498 


Homo sapiens 


21-Glutamic Acid-Rich Protein


"91(5 


87 


133 


W52811


Homo sapiens


Human DBI/ACBP-like protein
(DBIH).


705 


97 


134 


Y84444 


Homo sapiens 


Amino acid sequence of a 
human RNA-associated
protein.


3230 


100 


135 


N69181 


Homo sapiens 


non-muscle myosin B 


189 


20 


136 


W74882 


Homo sapiens 


Human secreted protein 
encoded by gene 154 clone 
HE6FL83.


480


100


137


W78200 


Homo sapiens 


Human secreted protein 
encoded by gene 75 clone 
HHGAU81. 


855 


99 


138


jit m^con 


Homo sapiens 


dJ349A12.1 (similar to 
KIAA0701 protein)


424 


39 


139 


■Ht VfiU<io 1 


Santalum
album


proline rich protein 


119 


30 


140


X70394 


Homo sapiens 


zinc finger protein 


1634 




141 




Homo sapiens 


Human protease HUPM-8. 


936 


100 


142 


Z68493 


Caenorhabdit 
is elegans 


predicted using Genefinder 


365 


42 


143 


AB018107


Arabidopsis
thaliana


ADP-ribosylation factor-like 
protein 


596 


65 


144 


AF161483 


Homo sapiens 


HSPC134 


580 


51


145


Y84902


Homo sapiens


A human proliferation and
apoptosis related protein.


480


100 


146 


AB004906 


Ipomoea 
purpurea 


transposase 


146 


20 


147 


AC007357 


Arabidopsis 
thaliana 


F3F19.18 


647 


31 


148


W75155 


Homo sapiens 


Human secreted protein 
encoded by gene 41 clone 
HNTME13.


1494 


98 


149 


DPnkCii an 
fir U3 D * J? u 


Homo sapiens 


cAMP-specific
phosphodiesterase 8A


3710


99 


150 


Y58171 


Homo
sapiens 


Human hydrolase homologue 
KHH-7 . 


785 


99 


151


U10397


Saccharomyces
cerevisiae


Yhr148wp


*1* 


•si" 


"151 


X73478 


Homo sapiens 


phosphotyrosyl phosphatase
activator 


1719 


99 


153 


AL049697 


Homo sapiens


dJ3 82110. 5.1 (novel protein 


2034 


99 
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similar to arginyl- tRNA) 






154 


AF169802 


Homo sapiens 


cytochrome b5 reductase b5R.2 


1455 


99 


155 


X94703 


Homo sapiens 


rab28 


1126 


99 


156 


Y25716 


Homo sapiens 


Human secreted protein 
encoded from gene 6 . 


1471


100 


156 


W77404 


Homo sapiens 


Secreted salivary polypeptide 
zsig32.


937 


100 


159 


Y17248 


Homo sapiens 


Human protein kinase 
inhibitor-2 (PKI-2).


383 


100 


160 


J04970 


Homo sapiens 


carboxypeptidase M precursor


2395


100


161


W54040


Homo sapiens 


Human interferon-inducible 
protein, HIPI. 


484 


98 


162 


AL022724 


Homo sapiens 


dJ413H6.1.1 (hamster
Androgen-dependent Expressed
protein like putative
protein) (isoform 1)


1357 


100 


163 


AF125535 


Homo sapiens 


pp21 homolog 


193 


45 


164 


G03632 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7713. 


463 


97 


165 


AJ250839 


Homo sapiens 


serine/threonine protein 
kinase 


1442 


71 


166 


L09649


Zymomonas
mobilis


zm2 


173 


37 


167 


Y73337 


Homo sapiens 


HTRM clone 1944530 protein 
sequence. 


1204 


100 


168 


W88645 


Homo sapiens 


Secreted protein encoded by 
gene 112 clone HUXFC71. 


1084 


100 


169


AF214731 


Homo sapiens 


ATP-dependent RNA helicase 


4402 


100 


170 


AE000871 


Methanobacterium
thermoautotrophicum


conserved protein


166 


27 


171 


Y27684 


Homo sapiens 


Human secreted protein 
encoded by gene No. 118. 


821 


100 


172 


AF226044 


Homo sapiens 


HSNFRK 


2904 


100 


173 


AJ245946 


Homo sapiens 


neuroglobin 


779 


100 


174 


D43949 


Homo sapiens 


This gene is novel.


3202 


100 


175 


Y07923 


Homo sapiens 


GTP-binding protein 


1205 


100 


176 


W90338 


Homo 
sapiens 


Human DPI homologue protein. 


966 


100 


177 


Y41675 


Homo sapiens 


Human channel-related
molecule HCRM-3.


1122


100


178


Y41674


Homo sapiens


Human channel-related
molecule HCRM-2.


936 


99 


179 


AF220492 


Homo sapiens 


krueppel-like zinc finger 
protein HZF2 


4100 


99 


180 


X03084 


Homo sapiens 


C1q B-chain precursor


1240 


100 


181 


U57344 


Mus musculus 


Meis3 


1813 


89 


183


U57344 


Mus musculus 


Meis3 


1743 


86 


184 


U57344 


Mus musculus 


Meis3 


1070 


86 


185 


AF033120 


Homo sapiens 


p53 regulated PA26-T2 nuclear
protein 


1389 


58 


186 


AF200357


Mus musculus 


pantothenate kinase 1 beta 


1605 


82 


187


W75058


Homo sapiens


Human secreted protein
encoded by gene 2 clone 
HLDBG33. 


1188 


S3 


188 


AJ292525 


Homo sapiens 


suppressor of sterile four 1 


2424 


100 


"Tso 


X54134 


Homo sapiens 


protein-tyrosine phosphatase


3705 


100 




Y22203 


Homo sapiens 


Human calcium-binding
phosphoprotein, CBPP-1,
protein sequence. 


1083 


99 


192 


W63692 


Homo 
sapiens 


Human secreted protein 12.


1975 


100 


193 


W87772 


Homo sapiens 


Human serum glucocorticoid- 
regulated kinase (H-SGK2) 
polypeptide.


2605 


99 
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194 


AE t 0842^9 


Mus musculus


bromodomain-containing
protein BP75 


693 


54 


195 


Y00752 


Rattus 
norvegicus 


serine dehydratase (AA 1-
327)


994 


61 


196 


W95349 


Homo sapiens 


Human foetal brain secreted 
protein fhl70 7. 


2596 


100 


197 


AB028859 


Homo sapiens 


hDj9 


1890 


100 


198 


W95633 


Homo sapiens 


Homo sapiens secreted protein 
gene clone hm236_l. 


1614 


100 


199 


Y44277 


Homo 
sapiens 


Human nucleic acid methylase- 
2 . 


2096 


99 


200 


AB030039 


Homo sapiens 


hPACPLl 


2258 


100 


201 


X54162 


Homo sapiens 


64 Kd autoantigen 


2918 


99 


202 


G02061 


Homo sapiens 


Human secreted protein, SEQ 

ID NO: 6142. 


558 


99 


203 


Xl388£ 


Nicotiana
tabacum


extensin (AA 1-620) 


185 


33 


204 


J04204 


Bos taurus 


32 kd accessory protein 


1837 


100 


205 


J04204 


Bos taurus 


32 kd accessory protein 


1101 


100 


207 


Y87283 


Homo sapiens 


Human signal peptide 
containing protein HSPP-60 
SEQ ID NO: 60. 


1318 


100 


208


Y02860


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 65. 


936 


98 


209 


AL121889 


Homo sapiens 


dJ1076E17.1 (KIAA0823 protein
(continues in AL023803))


694 


54 


210 


AF226732 


Homo sapiens 


NPD007 


1345 


76 


211 


X66295 


Mus musculus 


C1q C chain


970 


73 


212 


Z29328 


Homo sapiens 


Ubiquitin-conjugating enzyme
UbcH2 


966 


100 


213 


Z29328


Homo sapiens


Ubiquitin-conjugating enzyme
UbcH2 


542 


98 


214 


AJ002030 


Homo sapiens 


progesterone binding protein


1163 


100 


215 


X70649 


Homo sapiens 


member of DEAD box protein 
family 


3933


100 


216 


AF250£5& 


Homo sapiens 


claudin-2 


1169 


99 


217 


AL021453 


Homo sapiens 


dJ82iDli.i (PUTATIVE protein) 


259 


100 


218 


Y08565 


Homo sapiens 


UDP-GalNAc:polypeptide N-
acetylgalactosaminyltransferase


3331 


99 


219 


Y94452


Homo sapiens 


Human inflammation associated 
protein 


2067 


100 


220 


AL035521


Arabidopsis 
thaliana 


putative protein 


315 


42 


221 


AL031786 


Schizosaccharomyces
pombe


putative proline-tRNA
synthetase 


"Bll 


41 


222 


AU.09736 


Schizosaccharomyces
pombe


WD repeat protein 


626 


40 


223 


X52493 


Glycine max 


DNA-directed RNA polymerase


136


23 




AL035659 


Homo sapiens 


dJ979Nl.i (dJ979Nl.l) 


5199 


98 


225 


AB032401 


Mus musculus 


mmDj4 


1761 


92 




AB032401 


Mus musculus 


mmDj4


1988 


92 




X83502 


Saccharomyces cerevisiae


J1007 


112 


26 


228 


X83502 


Saccharomyces
cerevisiae


J1007 


79 


25 


229 


AF143723 




heat shock protein HSP60 


2557 


99 


230 


Y66677 


Homo 
sapiens 


Membrane-bound protein
PRO828.


982


100


231


AB027466 


Homo sapiens 


spondin 2 


1756 


99 


232 


W95634 


Homo 
sapiens 


Homo sapiens secreted 
protein. 


1391 


100 


233


W00365


Homo sapiens


Human cyclin B1.


2218 


99 


234


ysi7"62 


Homo sapiens 


A GTP-binding polypeptide 


1017 


100 
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designated RAQ. 






235 


Z50749


Homo sapiens


yeast sds22 homolog


1800


100


236


Z50749


Homo sapiens 


yeast sds22 homolog 


1754 


98 






Homo sapiens 


PICK1 


2137 


100


238


AJ270205


Entodinium
caudatum


putative

phosphatidylinositol-4-
phosphate 5-kinase


114 


37 


239 


nnm ft i a q 


Mus musculus 


contains transmembrane (TM) 
region and ATP binding region 


710 


93 


240 


W56538


Homo sapiens


Human hedgehog interacting
protein (HIP).


3785 


99 


241 


W56538 


Homo sapiens 


Human hedgehog interacting 
protein (HIP).


3436 


99 


242 




Homo sapiens 


NY-REN-37 antigen


996 


99 


243 


AF155107


Homo sapiens 


NY-REN-37 antigen 


1005 


100 


244 


AL031320


Homo sapiens 


dJ20N2.1 (novel protein 
similar to yeast and 
bacterial cytosine 
deaminase) 


763 


99 


245 


U37026


Rattus 
norvegicus 


sodium channel beta 2 subunit 


162 


30 




AIM 78599 


Homo sapiens 


dJ991C6.1 (novel protein 
similar to C. elegans 
F55A12.9 (Tr:P91086)) 


"2391 


98 


247 


U32274 


Saccharomyce 
s cerevisiae 


Ydr385wp; CAI: 0.12


191 


37 


248 


Y41719 


Homo 
sapiens 


Human PRO864 protein
sequence.


1079


100


249


AB029434


Homo sapiens 


ghrelin precursor 


611 


100 


250 


X97831


Rattus
norvegicus


carnitine/acylcarnitine
carrier protein


246


38


251


W80993


Homo 
sapiens 


Human RIP-interacting factor
RIF. 


1724 


100 


252 


Y94873 


Homo 
sapiens 


Human protein clone HP02632. 


1876 


100 


253 

->cX 


W59878 

~Kr -SdAC -a n 


Homo sapiens 


Amino acid sequence of the 
cDNA clone AIP-2 (HEBGM49).


765 


100 




ALJ54533 


Leishmania 
major 


possible adenylate kinase 


265 


34 


255 


AF233322 


Mus musculus 


zinc transporter like 2 


1916 


h '5s 




Y78113 


Homo sapiens 


Human cytokine signal 
regulator CKSR-1 SEQ ID 
NO:1.


2247


99 


257 




Arabidopsis 
thaliana 


putative amino acid transport 
protein 


390 


27 


258 


W74787 


Homo sapiens 


Human secreted protein 
encoded by gene 58 clone 
HHFHN61.


1171


100


259


AL035^89 


Homo sapiens 


CU18 7J11.1 (novel protein
similar to protein kinase C 
inhibitors) 


974 


100 


260 
261 


AE000909 
AL050131 


Methanobacte 
rium 

ophicum 
Homo sapiens 


serine/threonine protein
kinase related protein 

hypothetical protein 


363 


30 1 


262 
263 


AF019661 
AL035593 


Mus musculus 
Homo sapiens 


zeta proteasome chain; PSMA5 


626 
1214 


100 
100 


264 
265


AL022318
AF205940


Homo sapiens
Homo sapiens


CUJ310J6.1 (novel protein)
bK150C2.3 (PUTATIVE novel
protein similar to APOBEC1)
endomucin 


821 
1072 

1289 


100 
100 

100 


266 
267 


AL023583 
AL034548


Homo sapiens
Homo sapiens


dJ500L14.1 (novel protein)
dirno3G7.3 (novel protein
kinase domains containing
protein similar to
phosphoprotein C8FW)


789

1888


100

99 
268


AF161470


Homo sapiens


HSPC121


1884


98


269 
270 

271 


AF161470
X90763

AF207600


Homo sapiens
Homo
sapiens
Homo sapiens


HSPC121
HHa5 hair keratin type I

ethanolamine kinase


1232
2190


96 
99


272
273


M32334
AF161483


Homo sapiens


intercellular adhesion 
molecule 2 


1952 
1436 


100 
100 


274 


Y53052 


Homo sapiens 


Human secreted protein clone 
df202_J3 protein sequence SEQ 
id noTi in 


663 
587 


61 
100 


276 


Y77576


Homo sapiens


Human cytoskeletal protein
(HCYT) (clone 2195418).


762 


100 


277 


AF077042 


Homo sapiens


30S ribosomal protein S7
homolog 


1269 


100 


278 


Y94967 




Human secreted protein clone 
cal06_19x protein sequence 
SEQ ID NO: 20.


1619 


98 


279 


Y68788 


Homo sapiens 


Amino acid sequence of a 
human phosphorylation 
effector PHSP-20. 


2801 


99


280


Z75134


Canis
familiaris


rod transducin 


1816 


100 


281 

282 
283 


Z75134

AF249873
AL050007


Canis
familiaris 
Homo sapiens 
Homo sapiens 


rod transducin 

muscle-specific protein 
hypothetical protein 


1718

1395
405 


96 

100 
98 


284 
285 
286 

287 
288 


AF2ul931 
AF156102 
Y35897 

U88964
AL050143


Homo sapiens 
Homo sapiens 
Homo sapiens 

Homo sapiens 


DCl ' " 

ELL complex EAP30 subunit 
Extended human secreted 
protein sequence, SEQ ID NO.
146.

HEM45

hypothetical protein


1859
1318
1250

923 


99 
99 
99 

100 


289 
290 

291 


AJ011098 
Y6S724 

AF034801


Homo sapiens
Homo
sapiens
Homo sapiens


telethonin

Membrane-bound protein

PRO836.

liprin-alpha4


598
574
2321

2565 


100 
100 
100 

98 


292 
293 


AF034801 
AL049851 


Homo sapiens

Homo sapiens 


liprin-alpha4 

dJ889J22B.i (novel protein 
(isoform 1) ) 


2590 
1738 


100 

100


294 
295 


Y73348
L11672


Homo sapiens

Homo sapiens


HTRM clone 839^51 protein
sequence.

zinc finger protein 


1245 


99 


296 


AL035423 


Homo sapiens 


ttJ2bi3.1 (brain mitochondrial
carrier protein-1 (BMCP1))


1694 
1024 


44 

79 


297 
298 


AF198532 
AF161417 


Homo sapiens 
Homo sapiens 


lymphoid enhancer binding

factor-1

HSPC299 


2173 


100 


299 


AF159141 


Homo sapiens 


breast cancer metastasis- 


1147 
1236 


85 

99


300


U26397


Rattus
norvegicus


inositol polyphosphate 4-
phosphatase


160 


30 


301 
302 


AF036145 
Z82022 


Homo sapiens 
Homo sapiens 


meningioma-expressed antigen
s

GlcNAc-1-P transferase


3458 


100 


303 


AF269232


Mus musculus


butyrophilin-like protein
BUTR-1


2067 
271 


99 
50 


304


AJ222644


Arabidopsis
thaliana


asparaginyl-tRNA synthetase


659 


50 


305 
306 


AF054180 
AJ272079 


Homo 
sapiens 
Homo sapiens 


hematopoietic cell derived 
zinc finger protein 
APOBEC-1 stimulating protein 


35i 


"79 


308 
309 


Y44486 
AJ131891 


Homo
sapiens
Homo sapiens


Human GPRW receptor

polypeptide.

DNA polymerase mu


3056
1721

2598


100
100

100


310 


AF293335 


Homo sapiens 


p30 DBC 


1248 


92 


311 


AF176525 


Mus musculus 


F-box protein FBL12 


1501 


93 


312 


X57802 


Homo sapiens 


immunoglobulin lambda light 
chain 


959


81 


313 


Z36715 


Homo sapiens 


Net 


2048 


98 


314 


AF161532 


Homo sapiens 


HSPC047 


727 


100 


315 


AF208068 


Homo sapiens 


kelch-like protein KLHL3a 


3046 


100 


316 




Homo 
sapiens 


Membrane-bound protein
PRO1013. 


1166 


100 


317 


Y29666 


Homo sapiens 


Human Ras protein RAPR-1. 


1253 


98 


318 


AJ387747 


Homo sapiens 


sialin 


2614 


99 


319 


AFlctli62 


Homo sapiens 


HSPC099 


224 


40 


320 


Y63773 


Homo sapiens 


Amino acid sequence of a 
human phosphorylation 
effector phsp-5. 


2243 


99 


321 


AJ238379


Homo sapiens 


putative TH1 protein 


3013 


100 


322 


AB040812 


Homo sapiens 


protein kinase PAK5 


3792 


99 


323 


Y95013 


Homo sapiens 


Human secreted protein
vc48_1, SEQ ID NO:66.


913 


100 


324 


Y13381 


Homo sapiens 


Amino acid sequence of 
protein PRO271.


197^ 


100 


325 


Y94944


Homo sapiens


Human secreted protein clone
bf157_16 protein sequence
SEQ ID NO: 94.


2305 


98 


326 


Y76884 


Homo sapiens 


Retinoblastoma binding
protein-7 sequence.


6728 


99 


327 


AF198532 


Homo sapiens 


lymphoid enhancer binding 
factor-1 


2173 


100 


328 


Z78013 


Caenorhabditis
elegans


Similarity to Drosophila
Cadherin-related tumor
suppressor 


569 


33 


329 


AF212921 


Mus musculus 


MMTV receptor variant 1 


484 


94 


330 


Z75330 


Homo 

ean*! on el 
t?<X^> *.\Sl\o J 

>R65207 
R65207 02- 
MAR-1995 27- 
AUG- 1993 

Human 

stromalin-1 . 

[Homo 

sapiens 


nuclear protein SA-1


6492 


99 


331 


AL008583


Homo sapiens


dJ327J16.3 (supported by
GENSCAN, FGENES and GENEWISE)


2133 


99 


332 


Y36104 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
489. 


ild 


41 


333 


AJ271669 


Homo sapiens 


putative sialoglycoprotease 


1747 


100 


334 


AF156598 


Mus musculus 


p53-regulated DDA3


997 


64 


335 


M99058 


Eimeria 
maxima 


em100 gene is homologous the
Eimeria tenella gene et100


154


26 


336 


Y85564 


Homo sapiens 


Human homologue of UNC-53 
(Hs-UNC-53/1) sequence.


3386 


97 


337 


Y85564


Homo sapiens 


Human homologue of UNC-53 
(Hs-UNC-53/1) sequence. 


2602


94 


338 


Y85564 


Homo sapiens 


Human homologue of UNC-53 
(Hs-UNC-53/1) sequence.


3447 


98 


339


Z66561


Caenorhabditis
elegans


Similarity to Human rab13
protein (PIR Acc. No.
A49647).


716 


34 


340 


AB021643 


Homo 
sapiens 


gonadotropin inducible 
transcription repressor-3


2761 


99 


341


G01946


Homo sapiens


Human secreted protein, SEQ
ID NO: 5027.


465


98 


342 


AF020591 


Homo sapiens 


zinc finger protein 


1091 


48 


343 


L29154 


Homo sapiens 


immunoglobulin heavy chain 


439


84 








VDJ region 






344 


U10281 


Sus scrofa


gastric mucin 


279 


24 


345 


AK000404 


Homo sapiens 


unnamed protein product 


1177 


99 


346 


L22557 


Rattus 
norvegicus 


calmodulin-binding protein 


1949 


84 


347 


L22557 


Rattus 
norvegicus 


calmodulin-binding protein


2363


91 


■a a a 


AL049481 


Arabidopsis 
thaliana 


AIG1-like protein


316 


30 




AJ251516 


Mus musculus 


cysteine and histidine-rich 
protein 


1460 


39 


351 


AK024477


Homo sapiens


FLJ00070 protein


1773 


100 


352 


U50133 


Homo sapiens 


ankyrin 


502 


33 


353 


AX000625 


Homo sapiens 


unnamed protein product 


721 


100 


354 * 

T7= 


AF161420 


Homo sapiens 


HSPC302 


2623 


97 


35a 


AJ010014 


Homo sapiens


M96A protein 


1269 


47 


355 


AF1d1029 


Homo sapiens 


HSPC195 


941 


91 


357


AL022327


Homo sapiens


dJ355C10.1 (KIAA0027)


1911


100


358


W78128


Homo sapiens 


Human secreted protein 
encoded by gene 3 clone 
HOSBI96. 


1117 


100 


359 


X03414 


Drosophila 
melanogaster


Kr polypeptide 


316 


45 


360 


AF151079 


Homo sapiens 


HSPC245 


643 


100 


361 


Y53886


Homo sapiens 


A suppressor of cytokine 
signalling protein 
designated HSCOP-6. 


" 530 


41 


362 


AF254741 


Drosophila 
melanogaster 


Centaurin Gamma 1A 


681 


46 


"363 


AF213465 


Homo sapiens 


dual oxidase 


2016 


100 


364 


AF181562 


Homo sapiens 


proSAAS 


1319 


100 


365 


AF181562 


Homo sapiens 


proSAAS 


1024 


99 


366 


U7i200 


Mus musculus 


p116Rip


884 


82 


367 


AF263744 


Homo sapiens 


erbb2-interacting protein
ERBIN 


4973 


93 


368 


U37501 


Mus musculus


laminin alpha 5 chain 


"1*867 


72 


369 


AF043695 


Caenorhabditis
elegans


similar to the protein
phosphatase 2c family


549 


36 


370 


Y73440 


Homo sapiens 


Human secreted protein clone 
yj23_1 protein sequence SEQ
ID NO: 102. 


1484 


99 


371


AF272833 


Homo sapiens 


misato 


2869 


97 


372 


AF198454 


Homo sapiens 


epithelial protein lost in 
neoplasm beta 


3927 


100 


373 


Y73345 


Homo sapiens 


HTRM clone 438283 protein 
sequence.


273 


80 


374 


AF169017 


Homo sapiens 


formiminotransferase
cyclodeaminase


2717 


98 


375
376


A95106 


unidentified 


RED ALPHA 


1202 


99 




W74828


Homo sapiens


Human secreted protein 
encoded by gene 100 clone 
HLQA352.


1012 


99 


377 


Y32131 


Homo sapiens 


Human LYST-2 protein. 


3556


99 


378 


M14912 


Homo sapiens 


pol 


132 


86 


379 


AF090934 


Homo sapiens 


PRO0518 


382 


100 


380 


X66363 


Homo sapiens 


serine/threonine protein 
kinase 


2499 


100 


381


Y41699 


Homo
sapiens


Human PRO703 protein 
sequence.


2362 


100 


382 


AF174499 


Homo sapiens 


GR AF-i specific protein 
phosphatase 


7008 


98 


383 


U64608 


Caenorhabditis
elegans


coded for by C. elegans cDNA 
yk173c12.5


246 1 


36


384


U50133 


Homo sapiens 


ankyrin 


502 


33 


385 


AJ238826


Homo sapiens 


putative transcription 
factor-like nuclear regulator


4123 


97


387 


AF208845 


Homo sapiens 


BM-003 


1375 


99


389 


X57821


Homo sapiens 


immunoglobulin lambda light 
chain 


797


76


390 


AF182404 


Homo sapiens 


mitochondrial uncoupling 
protein 1 


1670 


99 


391 




Homo sapiens 


Human homologue of UNC-53
(Hs-UNC-53/1) sequence.


3386 


97 


393 


AF178432 


Homo sapiens 


SH3 protein 


3700 


100 


394 


AF229928 


Drosophila 
melanogaster 


cytoplasmic protein 89BC


1614




395 


AF181721


Homo sapiens 


RU2S 


2254 


100 


396 


Y69197 


Homo sapiens 


Amino acid sequence of a
human beta IV-spectrin
protein. 


1626


98 


397 


U48238


Mus musculus 


zinc finger protein neuro-d4 


749 


60 


398 


AL390137 


Homo sapiens 


hypothetical protein 


263 


51 


399 


AF217525 


Homo sapiens 


Down syndrome cell adhesion 
molecule 


5337 


60 


400 


AL022599 


Schizosaccharomyces
pombe


WD repeat protein 


447 


27 


401 


AC004859


Homo sapiens 


similar to 2-oxoglutarate
dehydrogenase; similar to
Q02218 (PID:g1352618)


4176 


78 


402 


AB010266 


Mus musculus 


tenascin-X 


10246


62


403 


AL133288 


Homo sapiens


dJ*71D7.1 (similar to 
D. melanogaster CG5986 
protein) 


761 


100 


404 


Z68753 


Caenorhabditis
elegans


ZC518.3b 


888 


48 


405 


Z78013 


Caenorhabditis
elegans


Similarity to Drosophila 
Cadherin-related tumor
suppressor 


569 


33 


406


AB031230 


Homo sapiens 


protein containing CXXC 
domain 2 


1196 


97 


407 


AF155106 


Homo sapiens 


NY-REN-36 antigen


1168 


100


408 


Y57945 


Homo sapiens 


Human transmembrane protein 
HTMPN-69. 


1538 


99 


409 


Z18361 


Ovis aries 


trichohyalin 


184 


30 


410 


AF249744


Homo sapiens 


RhoGEF 


2733 


100 


411 


AF176529 


Mus musculus 


F-box protein FBX13 


2072 


94 


412 


AF210842 


Homo sapiens 


HARP 


4880 


100 


413 


AL031*SI 


Homo sapiens 


dJ310O13.7 (novel protein 
similar to H. roretzi HRPET- 
3) 


776 


98 


414 


X57398 


Homo sapiens 


pm5 protein 


6131 


99 


415 


AB029824 


Homo sapiens 


3-methylcrotonyl-CoA
carboxylase biotin-containing
subunit


2961 


99 


416 


U43503 


Saccharomyces
cerevisiae


Lphlp 


115 


42 


417 


AL160493 


Leishmania 
major 


possible t25fl7.21 


239 


35 


418 


Y08100 


Homo sapiens 


Human PRO331 protein.


330 


29 


419 


U15131


Homo sapiens 


p126


2228 


54 


420 


AF117946 


Homo sapiens


Link guanine nucleotide
exchange factor II 


2363 


100 


421 


AF190635 


Drosophila
melanogaster 


ankyrin 2


755 


30 


422 


AF302150 


Homo 
sapiens 


phosphoinositol 3-phosphate-
binding protein-2


1962 


100 


423 


AL137530 


Homo sapiens 


hypothetical protein 


433 


94


424 


X63753 


Homo sapiens 


son-a


7269 


100


425


AB027249 


Homo sapiens 


MAPKK like protein kinase 


1693 


100 


426 


AF279144 


Homo sapiens 


tumor endothelial marker 7
precursor


1084 


55


427 


AF279144 


Homo sapiens 


tumor endothelial marker 7 
precursor 


1259


56


428 


AE003683


Drosophila 
melanogaster


CG8312 gene product 


149 


29 


429 


Y07829 


Homo sapiens 


RING finger protein 


2201 


99 


430


AF096897 


Drosophila
melanogaster 


pushover 


4442 


47 


431 


U41387 


Homo sapiens 


Gu protein 


4021 


99 


432 


AF023674 


Homo sapiens 


nephrocystin 


3783 


100 


433 


AF146760 


Homo 
sapiens 


septin 2-like cell division 
control protein 


2284 


100 


434 


AB006697 


Arabidopsis 
thaliana 


cleft lip and palate 
associated transmembrane 
protein-like 


886 


42 


437 


Y94247 


Homo sapiens 


Human calcium binding protein 
hCBP. 


1704 


100


438 


AB040672 


Homo sapiens 


UDP-GalNAc:polypeptide N-
acetylgalactosaminyltransferase


1075 


63 


439 


AF105228 


Bos taurus 


tuftelin 


285 


33 


440 


R06463 


Homo sapiens 


Derived protein of clone 
ICA13 (ATCC 40553).


3073 


99 


441 


X14971 


Mus musculus 


alpha-adaptin (a) (aa 1-977) 


4897 


98 


442 


X53773 


Rattus 
norvegicus 


alpha-c large chain (AA 1- 
938) 


3979 


81 


443 


Y46689


Homo 
sapiens 


Membrane-bound protein 
PRO1136.


3299 


99 


444 


AC067754 


Arabidopsis 
thaliana 


unknown protein; 20348-23707 


114 


33 


445 


AF229032 


Mus musculus 


piL 


2077 


93 


446 


AF056035 


Rattus 
norvegicus 


s-nexilin 


26^2 


85 


447 


AF132484 


Mus musculus 


unknown 


478 


51 


448 


W89024 


Homo sapiens 


Polypeptide fragment encoded 
by gene 156. 


528 


45


449


AF161445 


Homo sapiens 


HSPC327 


1606 


100 


450 


Z68753 


Caenorhabditis
elegans


ZC518.3b 


951 


49 


451


W39160 


Homo sapiens 


Human partial complement 
factor H protein fragment 3.


155 


32 


452


W85727 


Homo 
sapiens 


Novel protein (Clone 
BM46_10).


2799 


99 


453 


Y53629 


Homo sapiens 


A bone marrow secreted 
protein designated BMS115. 


2810 


100 


454 


D87438


Homo 
sapiens 


Similar to a C. elegans 
protein in cosmid C14H10 


4069 


100 


455 


AF240468 


Homo sapiens 


nicastrin 


3687 


100 


456 


Z15005 


Homo sapiens 


CENP-E 


13305 


99 


457


M59216


Homo 
sapiens 


gamma-aminobutyric acid
receptor beta-1 subunit 


2477 


100 


458 


Y73467 


Homo sapiens 


Human secreted protein clone 
yd6l_i protein sequence SEQ 
ID NO:156. 


966 


100 


459 


W67824 


Homo sapiens 


Human secreted protein 
encoded by gene IB clone 
HSLPM29.


535 


100 


460 


AF163151 


Homo sapiens 


dentin sialophosphoprotein 
precursor 


279 


19 


461


D87446 


Homo sapiens 


Similar to a C. elegans 
protein encoded in cosmid 
C27F2 (U40419) 


9196 


99 


462 


G04044


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8125. 


486 


93 


463 


AC002398 


Homo sapiens 


F25965 1 " 


1018 


100 


464 




Rattus sp. 


7acomp protein 


1845 


84 


465 


AF223408 


Homo sapiens 


B99


3666 


99


466 


AF223408 


Homo sapiens 


B99 


2878 


87


467


AF104415 


Mus musculus


gene trap locus -13 


6336 


91 


468 


U53450 


Rattus 
norvegicus 


Jun dimerization protein 1 
JDP-1


196 


49 


469 


AL031297 


Homo sapiens 


dJ97P20.1 (novel gene)


3564 


99 


470 


AF257077 


Homo sapiens 


eukaryotic translation 
initiation factor EIF2B 
subunit 3 


1274 


95 


471 


L28125 


Podospora 
anserina 


beta transducin-like protein


284 


38


472 


Y84903


Homo sapiens 


A human proliferation and 
apoptosis related protein. 


2337 


100


473


AF144237 


Homo sapiens 


LOMP protein 


252 


44 


474 


Y71213 


Homo sapiens 


Human irritable bowel disease
related polypeptide IMX39. 


838 


100 


475 


Y95006 


Homo sapiens 


Human secreted protein 
vel3jL, SEQ ID NO: 52. 


3411 


100 


476 


D3BS49 


Homo sapiens 


hal025 is new 


6533 




477 


AF241230 


Homo sapiens 


TAK1-binding protein 2


3656 


100 


478


AL031534


Schizosaccharomyces
pombe


putative asparagine synthase 


482 


40 


479 


L28125 


Podospora 
anserina 


beta transducin-like protein


233 


26 


480 


AF161544 


Homo sapiens 


HSPC059 


434 


77


481 


AJ238248 


Homo sapiens 


centaurin beta2


3986


99 


482 


Z38061 


Saccharomyces
cerevisiae


mal5, sta1, len: 1367, CAI:
0.3, AMYH_YEAST P08640
GLUCOAMYLASE S1 (EC 3.2.1.3)


295 


23 


483 


AF161381 


Homo sapiens 


HSPC263 


1404 


100 


484


AF223468 


Homo sapiens 


AD021 protein 


1314 


100 


486 


X$7'S27 


Homo sapiens 


alpha 1(VIII) collagen


4166 


99 


487


Y19062 


Homo sapiens 


39k3 protein 


2475 


100 


488 


Y73373 


Homo sapiens 


HTRM clone 921803 protein 
sequence.


555 


56 


489 


AL021918 


Homo 
sapiens 


b34I8.l (Kruppel related Zinc 
Finger protein 184) 


4184 


100 


490 


X53773 


Rattus 
norvegicus 


alpha-c large chain (AA 1-
938) 


4675 


97 


491 


U52426 


Homo sapiens 


GOK 


1459 


59 


492 


AL359773 


Leishmania 
major 


possible threonine synthase 


702 


45 


493 


AF226614


Homo sapiens 


ferroportin1


2929 


100 


494 


Z93241 


Homo sapiens 


dJ222Bl3.l (novel protein 
with some similarity to 
Drosophila KKAKKN) 


513 


96 


495 


AF036977 


Homo sapiens 


unknown


1812 


100 


496 


U93564 


Homo sapiens 


p40 


133 


45 


497 


Y91405 


Homo sapiens 


Human secreted protein 
sequence encoded by gene 2 
SEQ ID NO: 126. 


357 


100 


498 


AF059781 


Drosophila 
melanogaster 


Bem46-like protein 


653


43 


499 


Y16601 


Homo sapiens 


Human cell -cycle 


1658 


98 


500 


X70944 


Homo sapiens 


PTB-associated splicing 
factor 


3883 


100 


501 


AF027503 


Mus
musculus


putative membrane-associated 
guanylate kinase 1 


205 


36 


502 


AF262874 


Homo sapiens 


nectin 3; PRR3 


2856 


99 


503 


AJ249732 


Homo sapiens 


G8 protein 


669 


100 


504 
505 


AF208861 
L09708 


Homo sapiens 
Homo sapiens 


BM-019 

complement component C2 


1629 


100 


507 


X66285 


Mus musculus


HC1 ORF 


4022 
115 


100 
43 


508 


D00189 


Rattus 
norvegicus 


Na+,K+-ATPase alpha-subunit


5227 


99


509 

510 
511 


Y94971 

AB019038
AB019038 


Homo sapiens 

Homo sapiens 
Homo sapiens 


Human secreted protein clone 
fal7i_i protein sequence SEQ 
ID NO: 148.

beta-1,4 mannosyl transferase 
beta-1,4 mannosyl transferase 


2176 
781 


100 

77 


512 
513 


AB019038 
X84908 


Homo sapiens 
Homo sapiens 


beta-1,4 mannosyltransferase
phosphorylase kinase 


1347 

isio 

5729 


100 

99 

99 


514 


X55851 


Homo sapiens 






76 


515 


AF186084 


Homo 
sapiens 


epidermal growth factor 
repeat containing protein 


3046 


99 


516 


G03602 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 7683. 


505 


99 


517 


U04706 


Bos taurus


50 kDa protein 


1749 


77 


518 
519 


G00653 
AF161475 


Homo sapiens 
Homo sapiens 


Human secreted protein, SEQ 
ID NO: 4734. 

HSPC126


530 


100 


520 
521 


Y99366 
AF266652 


Homo sapiens 
Homo sapiens 


Human PRO1475 (UNQ746) amino
acid sequence SEQ ID NO: 88. 
PTPLA 


1368 
3394 


100 
97 


522 


AE000995 


Archaeoglobus
fulgidus


chromosome segregation 
protein (smc1)


1295 
153 


100 
20 


523 


AF062249 


Homo sapiens 


immunoglobulin heavy chain 
variable region 


605 


57 


524 


AJ223830


Rattus 
norvegicus 


ARE1 


2950 


98


525 


W01535 


Homo sapiens 


Cellular homologue of the 
SV40 large T antigen. 


127* 


83 


526 


AF145*5B 


Drosophila 
melanogaster 


BcDNA.GH10229


320 


33 


527


AF112213


Homo sapiens 


putative Rab5-interacting
protein 


524 


79 


528


D49387 


Homo 
sapiens 


NADP dependent leukotriene b4
12-hydroxydehydrogenase


i<ii<; 


100 


529 


Y30819 


Homo sapiens 


Human secreted protein 
encoded from gene 9. 


328 


32 


530 


AL079335 


Homo sapiens 


dJ132F21.3 (72.1 kDa protein
(DKFZP564A032, SBBI88)
similar to mouse IFN-gamma
induced MG11.)


1059 


99 


531 


Y9150* 


Homo sapiens 


Human secreted protein 
sequence encoded by gene 56 
SEQ ID NO: 179.


1159 


98 


532 


X76116 


Caenorhabdit 
is elegans 


carrier protein (c2)


576 


50 


533


X76116


Caenorhabditis
elegans




506 


50 


534 


X12966 


Homo sapiens 


3-oxoacyl-CoA thiolase 
propeptide (424 AA) 


1972 


100 


535 

536 
537 
538 


Y09267 

Z11713 
D84254 
D84224 


Homo sapiens 

Homo sapiens 
Homo sapiens 
Homo sapiens


flavin-containing
monooxygenase 2 
SRE-ZBP 

methionyl tRNA synthetase 
methionyl tRNA synthetase


2486* 

2201 7 
4741 


100 

99 
99 


539 
540 
541 

542 


D84224 
D84224 
J03244 

Y92514 


Homo sapiens 
Homo sapiens 
Bos taurus 

Homo sapiens 


methionyl tRNA synthetase 
methionyl tRNA synthetase 
H+ ATPase 31kDa subunit (EC 
3.6.1.3) 
Human OXRE-ll. 


3887 
2933 
4529 
848 


99
96 
99 

77 


543

"rn 




Homo 
sapiens 


Smad- and Olf-interacting 
zinc finger protein 


2301 
2151 


99 
61 




AE000919 


Methanobacterium
thermoautotrophicum


conserved protein 1 


207 


38 


545 


A06669 


synthetic 
construct 


preTGF-beta1


2070 


99


546


Y02698 


Homo sapiens 


Human secreted protein 
encoded by gene 49 clone 
HTPCS60.


854 


98 


547 


AF112205 


Homo sapiens 


VJSB-l protein 


2275 


100 


548 


X60271 


Mus musculus 


c-rel 


2264 


74 


549 


AC016827 


Arabidopsis 
thaliana 


putative GTPase 


810 


42 




Y704d0 


Homo 
sapiens 


Human cell-signalling
protein-2.


429 


' *8 


551 


AB048365 


Homo sapiens 


NEDD4-like ubiquitin ligase 1


8290 


99 


552 


Y57880 


Homo sapiens 


Human transmembrane protein 
HTMPN-4.


1112


95 


553 


AF119855 


Homo sapiens 


PRO1847


265 


67 


554 


M17236 


Homo sapiens 


MHC HLA-DQ alpha precursor 


1332 


100 


555 


" A&078468 


Arabidopsis 
thaliana 


putative protein 


540 


40 


556 


AC006563


Homo sapiens 


similar to Kelch proteins;
similar to BAA77027
(PID:g4650844)


515 


44 


557 


AK024487 


Homo sapiens 


FLJ00086 protein


1623 


98 


558 


M12140 


Homo sapiens 


pol gene protein; Xxx 


117 


48 


559 


W74825


Homo sapiens 


Human secreted protein 
encoded by gene 97 clone 
HAQBF73.


225 


56 


560 


X56681 


Homo sapiens 


junD protein


373 


8 8 1 


561


AF003136 


Caenorhabdit 
is elegans 


contains weak similarity to 
an AMP-binding motif 


2926 


54 


562 


AL109839 


Homo sapiens 


dJ1069P2.3.1 (novel PABPC1 
(poly (A) -binding protein) 


877 


100 


563 


AF181640


Drosophila 
melanogaster


BcDNA.GH09817


289 


42 


564


AF052723 


Feline 

leukemia 

virus 


gag-pol precursor polyprotein 
gPr80 


1^47 


43 


565


AF161472 


Homo sapiens 


HSPC123 


439 


44 


566 


Y28817 


Homo sapiens 


pt326_4 secreted protein. 


3338 


100 


567 


U09848 


Homo sapiens 


zinc finger protein 


1738 


100


569


AF155113 


Homo sapiens 


NY-REN-55 antigen


3603 


93 


570 


AF155113 


Homo sapiens 


NY-REN-55 antigen


3951 


99 


571 


AL032821 


Homo sapiens 


dJ55C23.1 (vanin 1) 


1821 


98 


572 


M69181


Homo sapiens 


non-muscle myosin B 


7350 


99


573


M69181 


Homo sapiens 


non-muscle myosin B 


7311 


98 


574
575


Y5967B 


Homo sapiens 


secreted protein 108-008-5-0-
E6-PL.


772 


100 




AL365234 


Arabidopsis 
thaliana 


putative protein 


788 


40 


576 


AL365234 


Arabidopsis 
thaliana 


putative protein 


788 


40 


577


X06745


Homo sapiens 


DNA polymerase alpha-subunit
(AA 1-1462)


7619 


99 


578 


AB041642 


Homo sapiens 


PAR-6


1342 


100 


579 


D86984 


Homo sapiens 


similar to yeast adenylate 
cyclase (S56776) 


2446 


100 


580 


AF165124 


Homo sapiens 


gamma-aminobutyric acid A
receptor gamma 2 


2499 


99 


581 


W88812 


Homo sapiens 


Polypeptide fragment encoded 
by gene 58. 


2339 


99 


582
583


U82319 


Homo sapiens 


novel ORF 


*42 


100 




P32219 


(human) 


\-Ki protein . \ 


11425 


99 


584


AJ223948


Homo sapiens 


RNA helicase 


6608 


99 


585 


Y08612 


Homo sapiens 


88kDa nuclear pore complex 
protein 


3874 


99 


587


Y42384 


Homo
sapiens 


Amino acid sequence of 
Iv3l0 7. 


1007 


37 




AF129756 


Homo sapiens 


BAT4 


1873 


98
TABLE 2


588 
589 


AF111775
AJ250865


Homo sapiens 
Homo sapiens 


Unknown 
TESS 2 


1929 
2348


99 


591 


Z98885 


Homo sapiens 


dJ522J7.2 (bromodomain-
containing 1 (similar to 
peregrin, BR140)) 


4167 


100 
100


592
593 


L74571 
AF091622 


Homo sapiens 
Homo sapiens 


nuclear hormone receptor 
PHD finger protein 3 


1355 


100 


594 


X56807


Homo sapiens 


desmocollin type 2a


9054 
4443 


100 
100


595 


AL137802 


Homo sapiens 


dJ798A10.1 (novel protein)


212 


55 


596 


AL022329 


Homo 
sapiens 


bK407F11.2 (adrenergic, beta,
receptor kinase 2) 


3653 


100 


597 


AF226048 


Homo sapiens 


GL003 


2009 


99 


598 


AJ278112 


Homo
sapiens


putative cell cycle control 
protein 


335 


23 


599 
600 


Y59741 
L36531 


Homo sapiens 
Homo sapiens 


Human normal ovarian tissue 
derived protein 10. 
integrin alpha 8 subunit 


1574 


99 


601


Y38458


Homo sapiens 


Human secreted protein 
encoded by gene No. 20. 


5386 
895 


99 
100 


602 
603


AF218584 


Homo sapiens 


GGA1 


3265 


100 


604 


Y1311S 
AL132776


Homo sapiens 
Homo sapiens 


serine/threonine protein
kinase 

dJ393D12.1 (KIAA0776) 


5071 


99 


605
604 


AL034452 


Homo sapiens 


dJ6B2J15.1 (novel Collagen 
triple helix repeat 
containing protein) 


2413 
1979 


99 
100 




Y14494 


Homo sapiens 


aralar1


3465 


99 


607


AJ001981 


Homo sapiens 


OXA1L


2603 


100 


608 


X86098 


Homo 
sapiens 


binds directly to adenovirus 
type 5 E1A protein 


3049 


100 


610 
611 


AF163572
AF161503 


Homo sapiens 
Homo sapiens


Forssman glycolipid
synthetase

HSPC154 


1665 
1261 


99 
97 


612 
613 


L41834 
Y91954


Ensis minor
Homo sapiens 


nuclear protein 

Human cytoskeleton associated 

protein 9 (CYSKP-9).


345 
3448 


30 
100 


614 
615 
616 


AL022327 

X85706 

Y08319 


Homo sapiens 
Homo sapiens 
Homo sapiens 


dJ355C18.1 (KIAA0027)
binding regulatory factor 
kinesin-2 


361 

3203 

3487 


94 

100 

99 


617 

618

619 


D12644 
U28789 
Y35914 


Mus musculus 
Mus musculus 
Homo sapiens 


KiF2 protein 
PACT 

Extended human secreted
protein sequence, SEQ ID NO. 
163. 


3609 
5936 
1684 


97 
89 
99 


620 
621


A30463B2 


Mus musculus


testis-abundant finger 
protein 


199 


23 


622 


Y06662 
AF068286


Homo sapiens 
Homo sapiens 


precursor polypeptide (AA -23 

to 1120) 

HUCMD38P 


3440 
861 


99 
100 


623 
624 

625


X98248 


Homo sapiens 
Homo sapiens 


sortilin 

75 kDa subunit NADH 
dehydrogenase precursor 


4436 
3734 


99 
99 


626 


S58544 
AF151027 


Homo sapiens 
Homo sapiens 


75 kda infertility-related 

sperm protein 

HSPC193 


2125 
582 


99 
93 


627 

628


X14968
Y50911 


Homo sapiens 
Homo sapiens 


RII-alpha subunit (AA 1-404)
Human fetal brain cDNA clone
vb7_1 derived protein


2079 

1983


100 
100


629 


Y50911


Homo sapiens 


Human fetal brain cDNA clone
vb7_1 derived protein


1694 


100 


630 


AF098786 


Homo 
sapiens 


17 beta-hydroxysteroid 
dehydrogenase type VII


1754 


100 


631 


AL03455S 


Homo 
sapiens 


dJ134019.3 (zinc finger
protein 151 (pHZ-67)) 


4273


100 


632 


W74826 


Homo sapiens 


Human secreted protein 
encoded by gene 98 clone 
HAQBT94.


794 


96 


633 


AF288288 


Homo sapiens 


HPT protein 


2236


100


634 


AF041429 


Homo sapiens 


pRGR1


823 


99 


635


X66357 


Homo sapiens 


serine/threonine protein 
kinase 


1589 


100 


636 


Y11284 


Homo sapiens 


AFX1 


2571 


98 


637 


AB004884


Homo sapiens 


PKU-alpha 


3718 


99 


638 


AJ002303


Homo sapiens 


synaptogyrin 1c


1020 


100 


639 


AJ002304 


Homo sapiens 


synaptogyrin 1b


1002 


100 


640 


AJ002303


Homo sapiens 


synaptogyrin 1c


933 


94 


641 


D87682 


Homo sapiens 


similar to a C. elegans
protein encoded in cosmid
T26A5.


2$7eJ 


100 


642 


M14660 


Homo sapiens 


ISG-K54 


2473 


99 


643 


X06661 


Homo sapiens 


calbindin (AA 1-261) 


1358 


100 


644 


AF119900 


Homo sapiens 


PRO2822


185 


76 


645 


AB031048 


Drosophila
melanogaster 


microtubule associated- 
protein orbit 


738 


27 


646 


AF250842 


Drosophila 
melanogaster 


multiple asters 


834 


29 


647 


X86691 


Homo sapiens 


Mi-2 protein 


10110


99


648 


U67934 


Homo sapiens 


44.9 kDa protein C18B11
homolog




96


649


AF236061 


Oryctolagus 
cuniculus 


RING-finger binding protein 




i»l 


650


AL034553


Homo sapiens 


dJ914P20.2 (KIAA0784 protein 
similar to Mus musculus 
activity-dependent
neuroprotective protein 
(Adnp) ) 




100 


653 


X14766 


Homo sapiens 


GABA-A receptor alpha 1 
subunit 


2388 


99 


654


AC004614


Homo sapiens 


similar to f-spondin proteins 
AB006086 (PID:g2529225) 


3026 


99 


655 


Y57908


Homo sapiens 


Human transmembrane protein 
HTMPN-32. 


608 


99 


656


Z34975 


Homo sapiens 


ldlCp 


3733 


100 


658 


AL050306 


Homo sapiens 


dJ475B7.2 (novel protein)


1942 


99 


659 


W76734 


Homo 
sapiens 


Human mDia Rho targeting 
protein. 


781 


34 


660 


AF202724 


Homo sapiens 


Sad1 unc-84 domain protein 1


2172 


100 


661 


Z21966 


Homo sapiens 


mPOU homeobox protein 


1529 


100 


662 


AJ242954 


Mus musculus 


dysferlin


4752 


59 


663 


AF182316 


Homo sapiens


myoferlin


6232 


99 


665 


AL161516 


Arabidopsis 
thaliana 


hypothetical protein 


209 


30 


667 


X59303 


Homo sapiens 


valyl-tRNA synthetase 


3393 


99 


668 


Y1335S 


Homo sapiens 


Amino acid sequence of 
protein PRO220.


36$2 


100 


669 


AB010692 


Arabidopsis 
thaliana 


contains similarity to endo- 
beta-N-acetylglucosaminidase
gene 


611 


52 


671 


X56123 


Mus musculus


talin 


4474 


76 


672 


AB039371 


Homo sapiens 


mitochondrial abc transporter 
3 


2902 


99 


673 


AF269223 


Homo sapiens 


TCP11 


806 


42 


674 


AF229633 


Mus musculus 


groucho- related protein 4 


4053


99 


675 


L14463


Rattus
norvegicus


1 transducin


3619


92


676


AC005757 


Homo sapiens 


R32611 1 


2779 


100 


677 


S61069 


Homo sapiens 


reverse transcriptase 
hornolog^pol t retroviral 
element) 


252 


65 


678 


AF271388 


Homo sapiens 


CMP-N-acetylneuraminic acid 
synthase 


2273 


100 


679 


X79066 


Homo sapiens 


ERP-1 


1783 


100 


680 


AF118566


Mus musculus


hematopoietic zinc finger 
protein


769 


50 


681 


Y51415 


Homo 
sapiens 


Human wild type pKe83 
protein. 


2(S21 


99 






Homo sapiens


ova . jl \novej. procein 
similar to a dual specificity 
phosphatase) 


/ UU 


CO 

bo 


683 


Y86214 


Homo sapiens 


Nuclear transport protein 
clone hfb34l protein 
sequence.


5888 


99 




I J 4 * ?3i 


Homo sapiens 


Human secreted protein clone 
SEQ ID NO: 110 . 


1 

JOl 


98 






Homo sapiens


factor 20 ( AR1 1 ( KTAA02Q? 1 
(isoform 2 ) ) 




67 


686 


AE00019S 


Escherichia 
coli 


orf, hypothetical protein


628 


100 


687 


M58378 


Homo sapiens 


By naps £n i 


3730 


99 


688 


AF039697 


Homo sapiens 


antigen NY-CO-31 


508 


98 


689 


U09355 


Oryctolagus
cuniculus


gamma subunit 


2356 


99 


690 


API SSI Ofi 




NY-REN- 16 dnfiapn 
i» i iMii'i jo nil u xy en 


265 


50 


691


AC004774 


Homo sapiens 


Dlx-5 


1542 


100 


692 


X90530 


Homo sapiens






99 


693 


X90530 


Homo sapiens 


ragB 


1405 


99 


694 


X90530 








85 


695


G01563 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 5644.


330 


100 


696


AC011810 


Arabidopsis
thaliana


Putative methionine
aminopeptidase


669 


52


697


AJ250425


Rattus
norvegicus


Collybistin I




98 


698 


AB037901 


Homo 


gene amplified in squamous 


5364 


99 


699 


Y99401


Homo sapiens 


Human PRO1327 (UNQ687) amino
acid sequence SEQ ID NO: 218.


1384 


100 


701 


AF221712 


Homo 
sapiens


Smad- and Olf-interacting


6705 


100 


702 


X83573 


Homo sapiens 


ARSE 


*184 


99 


703 


AJ243274 


Homo sapiens 


AP-2rep protein 


2078 


99 


704 


Y71262 


Homo sapiens 


Human chondromodulin-like 


1697 


94 


705 


Y71262 


Homo sapiens


Human chondromodulin-like
protein, Zchm1.


1736 


99 


706 


Y41257 


Homo sapiens


Uinfnn ari H eomionra rt^* 1 rimer 
/vitxuu a^f i.u cfdjuciii—e vi xtJiiy 

human FAIM * 


1060 


100 


707 


AL022237 


Homo sapiens 


bK1191B2.3 (PUTATIVE novel
Acyl Transferase similar to 
C. elegans C50D2.7) (isoform 
1) ) 


2030 


100 


708 


AJ006266 


Homo sapiens 


AND-1 protein 


5942 


100 


709 


G01571 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5652. 


777 


99 


710 


Y08698 


Homo sapiens 


ranbp3 


2649 


93 


711 


Y68770 


Homo sapiens 


Amino acid sequence of a 
human phosphorylation 
effector PHSP-2.


754 


99


712 


U93574 


Homo sapiens 


putative p150


799 


59 


713 


AC004531 


Homo sapiens 


Gene with similarity to DEAD
box helicases 


2715 


99 


714 


D89016 


Homo sapiens


Neuroblastoma 


538 


48 


715 


Y92175 


Homo sapiens 


Human cardiovascular system 
associated protein tyrosine 
phosphatase 2. 


734 


98 


716 


AL137013 


Homo sapiens 


bA311P8.3 (probable uracil 
phosphoribosyltransferase)


862 


100 


717 


AB035123 


Mus musculus


GD1 alpha/GT1a alpha/GQ1b
alpha synthase 


1696 


93 


718 


Y96290 


Homo
sapiens


Human IGFAM-2 immunoglobulin. 


2345 


35 


719 


X07979 


Homo sapiens 


integrin beta 1 subunit 
precursor 


4347 


99 


720 


AJ224819 


Homo sapiens 


tumor suppressor 


2149 


99 


721 


Y07595 


Homo sapiens 


transcription factor TFIIH 


2373 


100 


722 


W415G5 


Homo
sapiens


Human calpain.


1591 


99 


723


AF161341 


Homo sapiens 


HSPC078 


1097 


98 


724


AF187318


Homo sapiens 


F-box protein Fbx2 


1507


100 


725 


AC006708


Caenorhabdit 
is elegans 


contains similarity to
Saccharomyces cerevisiae pre- 
mRNA splicing protein PRP31 
(GB:Z72876) 


1143 


46 


726


AC006708


Caenorhabdit 
is elegans 


contains similarity to
Saccharomyces cerevisiae pre- 
mRNA splicing protein PRP31 
(GB:Z72876) 


9B8 


46 


727 


AC024818 


Caenorhabdit 
is elegans 


contains similarity to Pfam
family PF00400 (WD domain,
G-beta repeat), score=81.8,
E=1.4e-20, N=3


950 


44 


728 


AJ005897 


Homo sapiens 


JM5 


831 


47 


729 


Y45377 


Homo sapiens 


Human secreted protein 
fragment encoded from gene 
27. 


908 


97 


730


G03931 


Homo sapiens


Human secreted protein, SEQ 
ID NO: 8012. 


578 


100 


731 


AB012720 


Oncorhynchus 
masou


GTP-binding protein 


3865 


76 


732 


W73404 


Homo sapiens 


Human secreted protein 
encoded by Gene No. 8. 


862 


97 


733


G02650 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6731. 


644 


97 


734 


AC024813


Caenorhabdit 
is elegans 


Hypothetical protein 
Y54F10AL.a


152 


24 


735 


At635^61 


Homo sapiens 


dJ967N21.6 (novel CDP-alcohol
phosphatidyl transferase 
family member protein) 


1562 


98 


736 


U00033 


Caenorhabdit 
is elegans 


similar to S. cerevisiae Y0PU2 
protein 


605 


41 


737 


AF079098


Homo 
sapiens 


arginine-tRNA-protein 
transferase 1-1p; ATE1-1p


273* 


99


738 


AJ131712 


Homo sapiens 


nucleolar RNA-helicase 


2793


100 


739 


AJ1331Z5 


Homo sapiens 


TSC-22-like protein 


2054 


99 


740 


X98258


Homo sapiens 


M-phase phosphoprotein 9 


953 


100 


741 


X98258 


Homo sapiens 


M-phase phosphoprotein 9 


564 


74 


742 


U97191 


Caenorhabditis elegans


strong similarity to the YPT1 
subfamily of RAS proteins


9<S0 


85 


743 


X7S057 


Homo sapiens 


phosphomannose isomerase 


2191 


100 


744 


G03209 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7290. 


496 


98 


745 


X97064 


Homo sapiens 


Sec23 protein 


4034 


99 


746 


W93946 


Homo sapiens 


Human regulatory molecule 
HRM-2 protein. 


994 


100 


747 


Y73388 


Homo sapiens 


HTRM clone 3376404 protein
sequence.


1565 


99 


748


M19S29 


Sus scrofa 


follistatin A 


1906 


98 


749 


AJ249457 


Trichomonas 
vaginalis 


centrin, putative 


183 


28 


750 


AC004410 


Homo sapiens 


fos39554_1


2094 


100 


751 


AF074968 


Homo sapiens 


P47ING3 protein 


2167 


100 


752 


AF252284 


Homo sapiens 


transcription specificity 
factor Sp1


4005 


100 


753 


AB049629 


Homo sapiens 


phospholysine 

phosphohistidine inorganic 
pyrophosphate phosphatase 


1375 


99 


754 


D79205 


Homo sapiens 


ribosomal protein L39


160 


77 


755 


AB008430 


Homo sapiens 


CDEP 


142 


29 


758 


L32162 


Homo sapiens 


transcription factor 


574 


80 


759 


AF037204 


Homo sapiens 


RING zinc finger protein 


295 


54 


760 


Y44250 


Homo 
sapiens 


Human cell signalling 
protein-13.


£25 


100 


761 


AF21B586 


Homo sapiens 


Cide-b 


1136 


100 


762 


U38934 


Gallus 
gallus 


histone H2A


625 


97 


763 


AF226053 


Homo sapiens 


HSKM-B 


606 


32 


764 


X13403 


Homo sapiens 


Oct-1 protein (AA 1 - 743) 


3626 


100 


765 


D87446 


Homo sapiens 


Similar to a C. elegans 
protein encoded in cosmid 
C27F2 (U40419) 


568 


38 


766 


AL023828


Caenorhabditis elegans


Y17G7B.14 


200 


27 


767 


Y82777


Homo sapiens 


Human chordin related protein 
(Clone dw665 4) . 


2551 


99 


768 


X92475 


Homo sapiens 


ITBA1


1429 


100 


769 


Y42752 


Homo sapiens 


Human calcium binding protein 
3 (CaBP-3). 


1426 


100 


770 


X5141* 


Homo sapiens 


hormone receptor hERR1 (AA 1-
521)


2441 


97 


771 


AJ006591 


Homo sapiens 


cysteine-rich protein 


1793 


100 


772 


A08695 


Homo sapiens 


rap2 


935 


100 


773 


Z12173 


Homo sapiens 


N-acetylglucosamine-6-
sulphatase


2970 


100 


774 


Y919S0 


Homo sapiens 


Human cytoskeleton associated 
protein 5 (CYSKP-5).


565 


43 


776 


AL023799 


Homo sapiens 


dJ322P7.1 (zinc finger) 


855 


56 


777 


AL023799 


Homo sapiens 


dJ322P7.1 (zinc finger) 


855


56 


778 


G01880 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5961. 


849 


98


779 


AJ012590 


Homo sapiens 


glucose 1-dehydrogenase


4155 


99 


780 


AL078582 


Homo sapiens 


dJ130E4.2 (KIAA0796) 


1321 


68 


781


Z7S"955 


Caenorhabditis elegans


similar to mitochondrial 
carrier protein 


384 


34 


782 


AL109965 


Homo 
sapiens 


dJH21G12.2 (SCAN domain-
containing 1 protein)


900 


100 


783 


AF061262 


Mus 

musculus 


semaF cytoplasmic domain 
associated protein 2 


1316 


83 


784 


ti038^ 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 7954.


649


95


785


Y84441 


Homo sapiens 


Amino acid sequence of a 
human RNA-associated
protein. 


2074 


100 


786 


Y00918 


Homo sapiens 


Human Rab protein, RABP-1, 
protein sequence.


1048


Q Q 


787 


Z97029 


Homo sapiens 


ribonuclease HI large subunit 


1548


Q Q 


788 


AB0353B4 


Homo sapiens 


SRp25 nuclear protein


962 




789 


AF024631 


Homo sapiens 


ANG2 


2644 


100 


790 


AJ006710 


Rattus
norvegicus






97 


792 


V00638 


bant*p5Ti oohao 

X/t*\— L- w J» — UpliuM 

e lambda 


ivauiiivj xx.ci.uit; CalU 


600 


100 


793 


AP0491G3 


Homo sapiens


huntingtin interacting


819 


100 


795 


Z26317 


Homo sapiens 


desmoglein 2


4810 


99 


796 


Y76884 




rve u j. nuDids c Oiud oino l ng 
protein-7sequence . 


5080 


99


797


U15155 


Gallus
gallus




J /2 


37 


798 


U97189 


Caenorhabditis elegans


strong similarity to the
P13/P14 family of kinases 


227 


28 


799 


AF112201 


Homo sapiens


neuronal protein NP25 


1053 


100 


800 




Rattus
norvegicus


serine- arginine- rich splicing 
regulatory protein SRRP86 


958 


63 


801 


t\C ZD/ OSZ 


Homo sapiens 


placental protein 13-like 
protein 


743 


99 


802 


AF208851 


Homo sapiens 


BM-009 


766 


80 


803




Caenorhabditis elegans


Similarity to Human 
retinoblastoma-binding 
protein RBAP46 yk662dl2.5 
comes from this gene 


152 


27 


804 


G02113 




Human secreted protein, SEQ 

ID NO* 

XV IS\J m a X -7 ^ . 


496 


98 


805 


AL121673 


Homo sapiens 


bA305P22.1 (novel protein) 


1160 


100


806


AC013483 


Arabidopsis
thaliana


putative GTPase activator
protein


264 


30 


807 


AC013483


Arabidopsis 
thaliana 


putative GTPase activator 


264 


30


808 


AB013885 


Homo sapiens 


beta-ureidopropionase 


1494 


100 


809 


AF078842 


Homo sapiens 






99 


810 


AF161421 


Homo sapiens 


HSPC303


2134 


96 


811 




Homo sapiens 


DNA polymerase epsilon p17
subunit 


734 


100 


812 


274029 


Caenorhabditis elegans


Similarity to C. elegans 
alcohol dehydrogenase comes 
from this gene 


610 


71 


813 


273497 


Unmn flAni Ana 


cu£^uv.4t6 \Lore mscone 
H2A/H2B/H"? /H4 ) 


324 


100 


814 


W87689 


Homo sapiens




140% 


yy 


815


X16*282


Homo
sapiens


zinc finger protein (217 AA) 
(1 is 2nd base in codon) 


1109 


99 


816 


Z92539 


Mycobacterium
tuberculosis


pth 


300 


36 


818 


AB030483 


Mus musculus


B9 


1 Q*7 
117 / 




819 


AL117555 


Homo sapiens 


hypothetical protein 


321 


94 


820 


AC005328 


Homo sapiens 


R26660_2, partial CDS 


865


97


821 


G03951 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8032. 


700 


99 


822


L34807 


Musca
domestica


transposase 


174 


20 


823 


G02928 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7009. 


558 


78 


824 


Z99531 


Schizosaccharomyces pombe


caffeine-induced death
protein 1


184


29


825 


AJ006692 


Homo sapiens 


ultra high sulfer keratin 


693 


68 


826


U23037 


Oryctolagus 
cuniculus 


eIF-2Bepsilon 


3406 


90 


827 


G03412 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7493. 


464 


100 


828


Y30327 


Homo sapiens 


Human secreted protein 
encoded from gene 17. 


113 


44 


829 


Y32199 


Homo sapiens 


Human receptor molecule (REC) 
encoded by Incyte clone 
2022379. 


1012 


100 


830 


W78279 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 33. 


1264 


99 


832 


AB011542 


Homo sapiens 


MEGF9 


2097 


100 


833 


G02639 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6720. 


223 


70 


834 


AF119664 


Homo sapiens 


transcriptional regulator 
protein HCNGP 


1574 


100 


835 


AF119664 


Homo sapiens 


transcriptional regulator 
protein HCNGP 


1144 


89


836 


AF119664 




transcriptional regulator 
protein HCNGP 


1448 


94 


837


X12517 


Homo sapiens 


C protein (AA 1-159)


918 


100 


838


U32865 


Drosophila 
melanogaster 


linotte protein 


164 


24 


839 


AF067^!Ju 


Homo sapiens 


TLS-associated protein TASR-2


631 


56 


840 


U27831 


Homo sapiens 


striatum-enriched phosphatase


2840 


98 


841 


AF266366 


Homo sapiens 


CamKI-like protein kinase 


1796 


100 


842 


G02309 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6390. 


278 


98 


843


AE003615 


Drosophila 
melanogaster 


ade3 gene product 


113 


48 


844 


G01350 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5431. 


629 


100 


845 


U27838 


Mus musculus


glycosyl-phosphatidyl-
inositol-anchored protein
homolog


330* 


M 


847 


Y87788 


Homo sapiens 


Human RBP-26 protein. 


2026 


100 


848 


AF164794 


Homo sapiens 


Diff33 protein homolog


2398 


100 


849 


U41315 


Homo sapiens 


ZNF127-Xp 


2458 


93 


850 


AF192784 


Homo sapiens 


makorin 1 


2062 


97 


851 


Y58628 


Homo sapiens 


Protein regulating gene 
expression PRGE-21. 


1548 


100 


852 


Z22968 


Homo sapiens 


M130 antigen 


6205 


100 


853 


Z22971 


Homo sapiens 


M130 antigen extracellular 
variant 


**80 


100


854 


G03362 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7443. 


330 


96 


855 


G03362 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7443. 


203 


100 


856


AF285118 


Homo sapiens 


CGI-203 


452 


100 


857 


AC006069


Arabidopsis 
thaliana 


putative cleavage and 
polyadenylation specificity
factor 


1383 


55 


858 
859


AL021546 


Homo sapiens 


Cytochrome C Oxidase 
Polypeptide VIa-liver
precursor (EC 1.9.3.1) 


593 


100 






Xenopus 
laevis 


ribonucleoprotein


1664 


85 


860 


AF201947 


Homo sapiens 


MEK binding partner 1 


616 


100 


861 


L31783 


Mus musculus


uridine kinase 


1266 


52 


862


AF161472 


Homo sapiens 


HSPC123 


602 


73 


863


Z49068 


Caenorhabditis elegans


mitochondrial carrier protein 


370 


43 


864


AF154108 


Homo sapiens 


tumor necrosis factor type 1
receptor associated protein


3559


99


865


AE001530


Helicobacter
pylori J99


putative


230 


ST. 


866 


X57807 


Homo sapiens 


immunoglobulin lambda light 
chain 


£99 


91 


867
868


AL031673 
Y11652 


Homo sapiens 
Homo sapiens 


dub94B14.1 (PUTATIVE novel 
KRAB box protein with 18 C2H2 
type Zinc finger domains) 
phosphate cyclase 


4066


99


869 


AF192968 


Homo sapiens 


ii-iyu giucuae-reguiaLefl 
protein 8 


238 
3041 


100
99 


870 
871 


AB020648 
AL031427 


Homo sapiens 
Homo sapiens 


KIAA0841 protein 
dJ167A19.1 (novel protein) 


3237 
1608 


S9 
100 


872 
873 

874


AF151534 
AU)2133l 

X14608 


Homo sapiens 
Homo sapiens 

Homo sapiens 


core histone macroH2A2.2 
dJ366N23.1 (putative C. 
elegans UNC-93 (protein 1, 
C46F11.1) hiKS protein) 


1844 
1129 


100 
100 


875 


AL117334 


Homo sapiens 


propionyl-CoA carboxylase 
CU6 87F11.1 (novel protein 
(part of translation of cDNA 
DKFZp434N06l, Em:AL110249) ) 


3579
306


100 
100 


876 


X79489


Saccharomyces cerevisiae


E-925 protein 


446 


35 


877 
878 


YS3001 
AF231064 


Homo sapiens 
Homo sapiens 


Human secreted protein clone 
dn834_i protein sequence SEQ 

LU NU : o , 

CHMP1.5 


811 
957 


100 
100 


879 
880 


X79417 
AF001317 


Sus scrofa
Saccharomyces cerevisiae


40S ribosomal protein S12 
Soilp 


687 
478 


100 

28


881 
882 


Y67275 
M14036 


Homo sapiens 
Homo sapiens 


Human signal peptide 
containing protein HSPP-52 
SEQ ID NO: 52.
C1-inhibitor


2547 


100


883
884


AB041261 
AF020313


Mus musculus


calcium-independent
phospholipase A2 
proline-rich protein 48 


598 
2903 


77 
100 

84 


885 
886 


Y1093£ 
AF073997 


Homo sapiens 
Mus musculus


hypothetical protein 
myotubularin related protein 
1 


1104 
B66 


99 
36 


887 
888 


Y57893 
AL117635 


Homo sapiens 
Homo sapiens 


Human transmembrane protein 
HTMPN-17. 

hypothetical protein 


1099 


94 


889 

890


AF210317 


Homo sapiens 


facilitative glucose 
transporter family member 
GLUT 9 


929 
2046 


99 
99 




Y36031 




Extended human secreted 
protein sequence, SEQ ID NO. 
416. 


583 


100 


891 


Y36031 


Homo sapiens 


Extended human secreted
protein sequence, SEQ ID NO. 
416. 


192 


57 


892 
893 


AF237631 
AF090929 


Homo sapiens 
Homo sapiens 


ubiquitous tropomodulin U- 

Tmod 

PR00477p 


1798 


100 


894 
895


AL031228 




dJ1033B10.2 (WD40 protein 
BING4 (similar to S. 
cerevisiae YER082C, M. sexta 
MNG10 and C. elegans F28D1.1) 


653 
3196 


99 
100 


896 


AL031228 
AF171X02 


Homo sapiens 
Homo sapiens 


dJ1033B10.2 (WD40 protein 
BING4 (similar to S. 
cerevisiae YER082C, M. sexta 
MNG10 and C. elegans F28D1.1) 
retinal degeneration B beta 


2825 


96 


897 


AE003551 


Drosophila
melanogaster


CG18176 gene product 


1365 
633 


95 
33


898


AJ237946 


Homo sapiens 


DEAD Box Protein 5 


2443 


100 


899 


Z97184 


Homo sapiens 


EKE 2 


624 


100 


900 


Z97184 


Homo sapiens 


KKE2 


409 


98 


901 


AJ245587 


Homo sapiens 


Kruppel-type zinc finger 


1942 


100 


902 


AF091034 


Homo sapiens 


GTP-binding protein RAB22A 


1011 


100 


903 


R95953 


Homo sapiens 


Eukaryotic cell growth 
inhibiting factor. 


414 


96 


904 


L04733


Homo sapiens 


kinesin light chain 


1936 


72 


905 


AE003540 


Drosophila 
melanogaster 


CG10984 gene product 


446 


33 


906 


M55542 


Homo sapiens 


guanylate binding protein 
isoform I 


2993 


98 


907 


M55542 


Homo sapiens 


guanylate binding protein 
isoform I 


2901 


96 


908 


W34085 


Homo sapiens 


Human membrane fusion protein 
WDProl . 


1889 


100 


909 


AF168676 


Homo 
sapiens 


TNF intracellular domain- 
interacting protein 


647 


100 


910 


AB029150


Homo sapiens 


KRAB zinc finger protein 
HFB101L 


219* 


100 


911 


G02871


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6952. 


S21 


100 


912 


G03162 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7243.


387 


87 


913 


AJ243721 


Homo 
sapiens) 
>Y92508 
Y92508 13- 
APR- 2 00 0 06- 
OCT-1998. 

Unman nYDU _ 

5 . [Homo 
sapiens 


dTDP-4-keto-6-deoxy-D-glucose
4-reductase


1710 


100 


914 


U24189 


Caenorhabditis elegans


hypothetical protein 1207-1; 
Method : conceptual 

authors 


^44 

l 


41 


915 


Y02591 


Homo sapiens


A human progesterone receptor
complex p23-like protein.


843 


99 


916


AE000984 


Archaeoglobus fulgidus


dinitrogenase reductase 
activating glycohydrolase 
(draG) 


171 


26 


918


M23159 


Cricetus
cricetus


DHFR-coamplified protein


163 


30 


919 


L12018 


Caenorhabdit 
is elegans 


putative 


1232 


41 


920 


AF102177 


Homo sapiens 


tumor antigen SI/P-8p 


1260 


97 


921 


AL096712 


Homo sapiens


dJ744I24 2 (similar to a 
novel human gene mapping to 
Activator) 


1017




922 


AL161495 


Arabidopsis
thaliana 


putative WD-repeat protein 


866 




923 


AL161495 


Arabidopsis 
thaliana 


putative WD-repeat protein 


442 


36 


924 


U97001 


Caenorhabditis elegans


similar to

Schizosaccharomyces pombe 


605 


51 


925 


X71978 


Mus musculus


Fif 


1503 




926 


K92288 


Drosophila 
melanogaster 


beta-spectrin


290 


51 


927 


Y27575 


Homo sapiens 


Human secreted protein 
encoded by gene No. 9. 


1392 


100 


928 


Y22499 


Homo sapiens 


Human secreted protein 
sequence clone mh703_l. 


2249 


100 


930 


AJ224326 


Homo sapiens 


ribulose-5-phosphate- 
epimerase 


912 


100


931 


U28991 


Caenorhabditis elegans


coded for by C. elegans cDNA
cm21c7


660


55


932 


AL080065 


Homo sapiens 


hypothetical protein 


210 


25 


933 


G01384 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5965. 


767 


98 


934 


AJ276485 


Homo sapiens 


integral membrane transporter 
protein


1200


100 


935 


AL035681 


Homo sapiens 


dJ756G23.3 (novel protein 
similar to drosophila 
transcriptional repressor) 


1142


80


936


AB026808 


Mus musculus


synaptotagmin XI 


2142 


95 


937 


AB015345 


Homo sapiens 


HRIHFB22I6 


2601


99


938 


X65724 


Homo sapiens 


ORF2 


498 


100 


939 


W89024 


Homo sapiens 


Polypeptide fragment encoded 
by gene 156. 


1487 


100 


940 


G04047 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8128. 


117 


100 


941 


AF094S83 


Homo sapiens 


putative hiv-i infection 
related protein 


452


100 


942 


AC024200 


Caenorhabditis elegans


contains similarity to 
several zinc finger proteins 
but not to the zinc finger 
domains 


350 


69 


943 


AF129756 


Homo sapiens 


GbC 


273 


100 


944 


K2376S 


Rattus 
norvegicus 


alpha-tropomyosin


133 


96 


945 


AC009917 


Arabidopsis 
thaliana


Contains similarity to 


583 


47 


946 


AF223468 


Homo sapiens 


AD021 protein 


551 


44 


947 


AF0S5473 


Homo sapiens 


GAGE-8


273 


51 


948


X7575£ 


Homo sapiens 


protein kinase C mu 


2019 


*8 


949 


AF143956 


Mus musculus


coronin-2


2300 


93 


950 


Y36729 


Homo 
sapiens 


Human PG1 protein sequence. 


186*1 


99 


951 


W49041 


Homo sapiens 


Human low density lipoprotein 
binding protein LBP-2. 


202 


67 


952 


AB016881 


Arabidopsis 
thaliana 


gene_id:MXC17.7


203 


46


953 


Y0178S 


Homo sapiens 


Human ubiquitin-conjugating
enzyme >Y25341 Y25341 01-JUL- 
1999 12-AUG-1998 Human NCB-2 
protein. 


3^S 


100 


954 


AF145615 


Drosophila 
melanogaster 


BCDNA.GH03377 


823 


46 


955 


U09410 


Homo sapiens 


zinc finger protein ZNF131 


2463 


99 


956 


U09410 


Homo sapiens 


zinc finger protein ZNF131 


1853 


99 


957 


AF195623 


Homo sapiens 


cholinephosphotransferase 1 
alpha 


2126 


99 


958 


X94917 


Drosophila 
melanogaster 


head-elevated expression in 
0.9 kb


155 


32 


959 


U54807 


Rattus 
norvegicus 


GTP-binding protein


1167 


97 


960 


AF058807 


Bos taurus 


GTP-binding protein rah


606 


97 


961 


G03244 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7325. 


471 


100 


962 


AF078850 


Homo sapiens 


steroid dehydrogenase homolog 


583 


40 


963


AP001754 

IVT.n^^AT Q 


Homo sapiens 


transient receptor potential- 
related channel 7, a novel 
putative Ca2+ channel protein


317


30 


964 




Homo sapiens 


dJ1100H13.I (putative novel 
protein) 


1129 


100 


965 


X61301 


Rattus
rattus


interferon-induced protein


202 


46 


966 


D3B169 


Homo 
sapiens 


inositol 1,4,5-trisphosphate 
3-kinase isoenzyme 


3278 


100 


967 


AL031432 


Homo 
sapiens 


dJ^5N24.2.1 (PUTATIVE novel
protein) (isoform 1)


893 


100


968 


U79275 


Homo sapiens 


unknown


611


100 


969 


AJ0L1306 


Homo 
sapiens 


guanine nucleotide exchange 
factor (long isoform) 


2752 


99


970 


AF281134 


Homo sapiens 


exosome component Rrp46 


1186 


100 


971 


U5333* 


Caenorhabditis elegans


weak similarity over a short
region to myosin heavy chain 


536 


23 


972 


AC018749 


Leishmania 
major 


L8840.12 


589 


S3 


973 


AP188504 


Mus musculus 


LNV 


544 


85 


974 


U25801 


Homo sapiens 


Tax1 binding protein


852


98 


975 


AP049523 


Homo sapiens 


huntingtin-interacting 
protein HYPA/PBP11 


1390 


97 


976 


AF161530 


Homo sapiens 


HSPC182 


1040 


100


977 


G04020 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8101. 


626 


100 


978 


AP164797 


Homo sapiens 


ribosomal protein L17 isolog 


908 


100 


979 


094991 


Xenopus 
laevis 


transcription factor XLMOl 


795 


97 


980 


S73775 


Homo sapiens 


calmitine; calsequestrine 


2029 


100 


981 


Y94888 


Homo 
sapiens 


Human protein clone HP01462. 


2501 


100 


982 


AJ243191 


Homo sapiens 


heat shock protein 


827 


96 


983


X65020 


Bos taurus 


PSST subunit of the NADH: 
ubiquinone oxidoreductase 
complex 


964 


85 


984 


AJ249207 


Rhodococcus
sp. AD45 


on fca t" l v#» r*a renu op 


» CI 


43 


985 


Z30093 




basic transcription factor 2,
35 kD subunit


13 /O 


99 


986


AB030835 




Contains two glutamine rich
domains, three zinc-finger
domains, and matrin 3
homologous domain 3 (MH3)


A COT 


"no 


987 


AF227258 


Bos taurus 


RPGR- interacting protein-1 


1262 


38 


988 


AL022238 


Homo sapiens 


dJ1042K10.2 (supported by
GENSCAN, FGENES and GENEWISE)


4048 


99 




AL022238 


Homo sapiens


dJ1042K10.2 (supported by
GENSCAN, FGENES and GENEWISE) 


2321 


99 


990 


AF161426


Homo sapiens 


HSPC308 


448 


92 


991 


AF161426 


Homo sapiens 


HSPC308


448 


92


992


AF161426


Homo sapiens 


HSPC308 


453 


92 


993 


AL023859 


Schizosaccharomyces pombe


trna-splicing endonuclease 
subunit 


172 


42 


994


AL049631 


Homo sapiens 


dJ513M9.1 (novel Homeobox 
domain protein) 


241 


47 


995 


AC005253


Homo sapiens 


R26445_1


902 


100 


996 


AF265206 


Homo sapiens 


MOG1 isoform A


974 


100 


997 


AJ248285 


Pyrococcus
abyssi


sarcosine oxidase, subunit 
beta (soxB) 


195


28 


998 


AE003641


Drosophila 
melanogaster


BG:DS00941.3 gene product 


218


SB 


999


W69343 


Homo 
sapiens 


Secreted protein or 1 clone 
CR930 1. 


1340 


93 


1000 


AY007135 


Homo sapiens 


similar to bovine ADP/ATP 
translocase Tl mRNA with 
GenBank Accession Number 
M24102.1 




inn 


1001 


Y73381 


Homo sapiens 


HTRM clone 1877278 protein 
sequence . 


1668 


100 


1002 


AF208844 


Homo sapiens 


BM-002 


428 


100 


1003 


AE004944 


Pseudomonas 
aeruginosa 


hypothetical protein 


134 


35 


1004


AL031431 


Homo sapiens 


dJ462023.2 (novel protein) 


2058 


100


1005 


S45367 


Canis
familiaris 


centractin 


1949 


100


1006 


S45367


Canis
familiaris


centractin 


1315 


98 


1007 


AB022158 


Mus 

musculus 


chaperonin containing TCP-1 
epsilon subunit 


2649 


96


1008 


Y76332 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 38. 


1282 


97 


1009 


AB011414 


Homo sapiens 


Kruppel-type zinc finger 
protein 


1671 


58 


1010 


Z68218 


Caenorhabdit 
is elegans 


K01H12.1 


269 


67 


1011 


AB011414 


Homo sapiens 


Kruppel-type zinc finger 
protein 


1671 


58 


1012 


Z14000 


Homo sapiens 


RING1 


2017 


100 


1013 


G02841 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6922. 


332 


93 


1014 


AF145659 


Drosophila 
melanogaster 


BCDNA.GH10333 


1244 


52 


1015 


Y02860 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 65. 


664 


67 


1016 


Y02591 


Homo sapiens 


A human progesterone receptor 
complex p23-like protein. 


772 


97 


1017


Y99448 


Homo sapiens 


Human PR01759 (UNQ832) amino 
acid sequence SEQ ID NO: 374. 


2323 


100 


1018 


X67250 


Rattus 
norvegicus 


n-chimaerin


1710 


97 


1019 


AF183417 


Homo 
sapiens 


microtubule-associated
proteins 1A/1B light chain 3 


631 


100 


1020 


AF164795 


Homo sapiens 


sex-regulated protein janus-a 


674 


100 


1021 


AF190625 


Coturnix 
coturnix 


qdgl-1 


638 


96 


1022 


AL133363 


Arabidopsis
thaliana 


putative protein 


155 


37 


1023 


AB034912 


Homo sapiens 


WD-repeat like sequence


2483 


100 


1024 


AY007091 


Homo sapiens 


similar to Homo sapiens 
mammalian inositol 
hexakisphosphate kinase 2 
(IP6K2) mRNA with Ge


2243


100 


1025 


X69910 


Homo sapiens 


P63 protein 


2958 


99 


1026


U8073* 


Homo sapiens 


CAGF9 


1657


100 


1027 


AB029333 


Halocynthia 
roretzi 


HrPET-1 


1048 


54 


1028 


AB032931 


Homo sapiens 


ubiquitin-conjugating enzyme
isolog 


1045 


100 


1029 


G01797 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5878. 


749 


98 


1030 


G01797 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5878. 


749 


98 


1031 


AF193795 


Homo sapiens 


vacuolar sorting protein 
VPS29/PEP11 


960 


100 


1032 


AJ222968 


Mus musculus 


L-periaxin 


120 


30 


1033 


Z81317 


Schizosaccharomyces pombe


DNA2-NAM7 helicase family 
protein 


685 


31 


1034 


Y41519 


Homo sapiens 


Fragment of human secreted
protein encoded by gene 75. 


1321 


99 


1035 


AJ276004 


Mus musculus 


Paxneb protein 


1709 


77 


1036 


AF025459


Caenorhabditis elegans


H14A12.3 gene product


190


30 


1037 


U37251 


Homo sapiens 


Description: KRAB zinc finger
protein; this is a splicing 
supplied by author 


19* 


43 


1038 


W74580 


Homo 
sapiens 


Human membrane protein
BA0306. 


1921 


97 


1039 


U88173 


Caenorhabditis elegans


weak similarity to 
Arabidopsis thaliana 
ubiquitin-like protein 8 


331 


80


1040 


AF2S0204 


Homo sapiens 


blood group carrier molecule 
DOK1 


1637 


99 


1041 


Y96730 


Homo 
sapiens 


PR0539, a Costal-2 homologue.


162 


22 


1042 


AF140683 


Mus musculus 


F-box protein FWD2 


2397 


98 


1043 


AF151023 


Homo sapiens 


HSPC189 


1104 


100 


1044 


AF181631 


Drosophila 
melanogaster 


BcDNA.GH04929


204 


37 


1045 


Y77985 


Homo sapiens 


Human collectin amino acid
sequence.


1940 


100 


1046 


AJ243972 


Homo sapiens 


6-phosphogluconolactonase 


1317 


100 


1047 


AB035863 


Homo sapiens 


ATP specific succinyl CoA 
synthetase beta subunit 
precursor 


2324 


99 


1048 


AL034550 


Homo sapiens 


dCU184F4.2 (novel protein 
similar to nucleolar protein 
4 (NOL4) (NOLP))


981 


92 


1049 


AF163825


Homo sapiens 


pre-B lymphocyte protein 3 


634 


100 


1050 


AP201949 


Homo sapiens 


60S ribosomal protein L30 
isolog 


868 


100 


1051 


AF190624 


Mus musculus 


mdgl-1


236 


85 


1052 


AE003529 


Drosophila 
melanogaster 


CG6151 gene product 


160 


44 


1053 


G01191 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5272. 


646 


98 


1054 


AL162756 


Neisseria 
meningitidis 


Glu-tRNA(Gln) 

amidotransferase subunit A


682 


44


1055 


XF181856 


Rattus
norvegicus 


tRNA selenocysteine
associated protein 


1525 




1056 


U89649 


Chlamydomonas
reinhardtii


Mrl9,000 outer arm dynein 
light chain 


244 


34 


1057 


AF159141 


Homo sapiens 


breast cancer metastasis- 
suppressor 1 


663 


53 


1058 


AF230929 


Homo 
sapiens 


keratinocyte annexin-like
protein pemphaxin 


1710 


99 


1059 


AJ270952 


Homo sapiens 


putative membrane protein 


1363 


100 


1060


AF224263 


Heterodontus
francisci 


HoxDS 


742 


83 


1061 


X63417


Homo sapiens 


IRLB 


1037 


100 


1062 


AL079345 


Streptomyces 
coelicolor 
A3 (2) 


hypothetical protein 


143 


27 


1063 


Y71112 


Homo sapiens 


Human Hydrolase protein-10
(HYDUL-10). 


2547 


100 


1064 


AF263614 


Homo sapiens 


acetyl-CoA synthetase 


3493 


99 


1065 


Y13356 


Homo sapiens 


Amino acid sequence of 
protein PR0221. 


1363 


100 


1066 


AC006153 


Homo sapiens 


similar to Aquifex aeolicus 
GTP-binding protein; similar 
to AE000771 (PID:g2984292)


462 


98 


1067 


Y18930 


Sulfolobus 
solfataricus 


hypothetical protein 


162 


29


1068 


R65969 


Homo 

sapiens T98G 


Glioblastoma-derived 
polypeptide. 


887 


100 


1069 


Y07964 


Homo sapiens 


Human secreted protein 
fragment 


863 


$6 


1070 


AF177476 


Rattus 
norvegicus 


CDX5 activator-binding 
protein 


1995 


86 


1071 


AF24550S 


Homo sapiens 


adlican 


3109 


99 


1072 


U92794 


Mus musculus 


alpha glucosidase II, beta
subunit 


147 


36 


1073 


G03889 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 7970. 


698 


98 


1074 


U15779


Homo sapiens 


p 70 


380 


28 


1075


Vl339i " 


Homo sapiens 


Amino acid sequence of
protein PR0328.


1271


91


1076 


AF161457 


Homo sapiens 


HSPC339 


571 


100 


1077 


Y79509 


Homo sapiens 


Human carbohydrate-associated
protein CRBAP-5. 


2151 


98 


1078 


AF223466 


Homo sapiens 


HT015 protein 


831 


66 


1079 


AL132965 


Arabidopsis 
thaliana


putative WD-40 repeat-protein 


286 


29 


1080


AB024937 


Homo sapiens 


LUNX 


1284 


100 


1081 


Y14768 


Homo sapiens 


V-ATPase G-subunit like 
protein 


579 


100 


1082


AF016416 


Caenorhabditis elegans


F29A7.4 gene product 


141 


31 


1083 


L13291 


Homo sapiens 


ADP-ribosylarginine hydrolase 


802 


45 


1084 


AB041541 


Mus musculus 


unnamed protein product 


151 


44 


1085


G01922 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6003. 


202 


97 


1086 


AB030814 


Homo sapiens 


H-REV107 protein homolog 


833 


100 


1087 


AF151638 


Homo sapiens 


phosphatidylcholine transfer 
protein 


1142 


100 


1088 


Y84432 


Homo sapiens 


Amino acid sequence of a 
human RNA-associated 
protein. 


2783 


100 


1089 


Y94867 


Homo 
sapiens 


Human protein clone H?1056"3. 


613 


100 


1090 


AK023982


Homo sapiens 


unnamed protein product 


130 


49 


1091 


AB041586 


Mus musculus 


unnamed protein product 


1103 


81 


1092 


Y71277 


Homo sapiens 


Human Zlipo3 protein.


606 


100 


1093


034 973 


Mus musculus 


protein tyrosine phosphatase-
like


1131 


95 


1094 


Y66677 


Homo 
sapiens 


Membrane-bound protein
PR0828.


522




1095 


Y87276 


Homo sapiens 


Human signal peptide 
containing protein HSPP-53
SEQ ID NO: 53. 


1029 


99 


1096 


Y87276 


Homo sapiens 


Human signal peptide 
containing protein HSPP-53 
SEQ ID NO: 53. 


863 


98 


1097 


AF161455 


Homo sapiens 


HSPC337 


742 


98 


1098 


U80029 


Caenorhabditis elegans


similar to thioredoxin


242 


39 


1099 


AJ005866


Homo sapiens 


Sqv-7-like protein 


1321 


99 


1100 


AJ005866 


Homo sapiens 


Sqv-7-like protein


1118 


99 


1101 


AJ005866 


Homo sapiens 


Sqv-7-like protein


891 


99 


1102 


AJ005866 


Homo sapiens 


Sqv-7-like protein 


1016 


99 


1103 


AL110244 


Homo sapiens 


hypothetical protein 


299 


31 


1104 


AF242194 


Drosophila 
melanogaster 


brakeless-B 


147 


52 


1105 


AL031010 


Homo sapiens 


dJ422F24.1 (PUTATIVE novel 
protein similar to C. elegans 
C02C2.5) 


968 


100 


1106 


U28016 


Mus musculus 


parathion hydrolase 
(phosphodiesterase) -related 
protein 


1624 


87 


1107 


AJ278150 


Homo sapiens 


putative lipid kinase 


2207 


99 


1108 


G03733 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7814. 


495 


98 


1109 


AF217287 


Drosophila 
melanogaster 


G protein RhoBTB 


834 


54 


1110 


Y2B921 


Homo 
sapiens 


Human regulatory protein 
HRGP-7. 


941 


48 


1111 


Y2B921 


Homo 
sapiens 


Human regulatory protein 
HRGP-7. 


1331 


51 


1112 


AF176704 


Homo sapiens 


F-box protein FBX9 


2027 


99 


1113 


AF182076 


Homo 
sapiens 


glioma tumor suppressor 
candidate region protein 2 


2418 


100 


1114 


G04039 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8120. 


475 


96 


1115 


Ar 2294 3 9 


Mus musculus 


zinc finger protein 289 


1697 


91 


1116 


L40357 


Homo sapiens 


thyroid receptor interactor 


509 


100 


1117 


L40357 


Homo sapiens 


thyroid receptor interactor 


464 


85 


1118 


A12155 


Homo sapiens 


Human X5L cDNA. 


1673 


100 


1119 


AL161542 


Arabidopsis 
thaliana 


isomerase like protein 


607 


53 


1120 


AL023754 


Homo sapiens 


dJ272L16.1 (Rat 
Ca2+/Calmodulin dependent 
Protein Kinase LIKE protein) 


2341 


98 


1121 


Y57901 


Homo sapiens 


Human transmembrane protein 
HTMPN-25. 


321 


36 


1122 


214 122 


Xenopus 
laevis 


XLCL2 


455 


77 


1123 


AP225418 


Homo sapiens 


lipase 


1531 


97 


1124 


YO^SIB 


Homo sapiens 


Zen GTPase interacting 
protein ZIP. 


3227 


100 


1125 


AL035690 


Homo sapiens 


dJ202I2i.i (novel protein) 


952 


100 


1126 


AJ000217 


Homo sapiens 


CtlC2 


1266 


99 


1127 


AB030S05 


Mus musculus 


UBE-lc2 


1069 


79 


1128 


Y733^5" 


Homo sapiens 


HTRM clone 1427838 protein 
sequence. 


874 


100 


1129 


Y78941 


Homo sapiens 


Cyclophilin-type peptidyl 
prolyl cis/trans isomerase 
amino acid sequence. 


877 


100 


1130 


AL023553 


Homo sapiens 


dJ347H13.4 (novel protein) 


557 


100 


1131 


Y9194S 


Homo sapiens 


Human chaperone protein 6 
(HCHP-6). 


1408 


100 


1132 


Z68197 


Schizosaccha 
romyces 
pombe 


putative nuclear pore protein 


596 


J it 


1133 


Z68197 


Schizosaccha 
romyces 
pombe 


putative nuclear pore protein 


389 


35 


1134 


AF180681 


Homo sapiens 


guanine nucleotide exchange 
factor 


3597 


100 


1135 


AF079765 


Mus musculus 


enhancer of polycomb 


264 


41 


1136 


M62419 


Mus musculus 


clathrin-associated protein 


2189 


99 


1137 


AJ006219 


Drosophila 
melanogaster 


clathrin-associated protein 


1254 


78 


1138 


Y76218 


Homo sapiens 


Human secreted protein 
encoded by gene 95. 


440 


98 


1139 


WB8I04" 


Homo 
sapiens 


A Rab protein designated 
HRABS-2. 


1065 


99 


1140 


Y13401 


Homo sapiens 


Amino acid sequence of 
protein PRO339. 


3979 


98 


1141 


W65026 


Chimeric - 
Homo sapiens 


Green fluorescent protein- 
Zap70 fusion product. 


3369 


100 


1142 


Y13402 


Homo sapiens 


Amino acid sequence of 
protein PRO310. 


1694 


99 


1143 


G03875 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7956. 


660 


99 


1144 


Y12917 




Amino acid sequence of a 
human secreted peptide. 


750 


98 


1145 


Y12917 


Homo sapiens 


Amino acid sequence of a 
human secreted peptide. 


109* 


100 


1146 


AL022157 


Homo sapiens 


SPIN (SPINDLIN HOMOLOG 
(PROTEIN DXF34)) 


1233 


100 


1147 


AL022157 


Homo sapiens 


SPIN (SPINDLIN HOMOLOG 
(PROTEIN DXF34)) 


1233 


100 


1148 


G02548 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6629. 


370 


93 


1149 


Y73338 


Homo sapiens 


HTRM clone 2019742 protein 
sequence. 


1492 


100 


1150 


W74841 


Homo sapiens 


Human secreted protein 
encoded by gene 113 clone 
HEAAR60. 


228 


55 


1151 


AF044201 


Rattus 
norvegicus 


neural membrane protein 35; 
NMP35 


1570 


92 


1152 


AF156774 


Homo 
sapiens 


lysophosphatidic acid 
acyltransferase-gamma 1 


1855 


99 


1153 


AL118501 


Homo sapiens 


dJ1191N16.1 (A novel protein 
(translation of the cDNA 
DKFZp566A0946, Em:AL050069)) 


872 


64 


1154 


AF131852 


Homo sapiens 


Unknown 


473 


100 


1155 


Y41705 


Homo 
sapiens 


Human PRO352 protein 
sequence. 


1381 


97 


1156 


G04036 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8117. 


607 


99 


1157 


AF112444 


Lupinus 
luteus 


L-asparaginase 


287 




1158 


AF151848 


Homo sapiens 


CGI-90 protein 


232 


32 


1159 


AJ272267 






2449 


100 


1160 


ABO 01 771 


savianvi 




iy6 


33 


1161 


Y87330 


Homo sapiens 


Human signal peptide 
containing protein HSPP-107 

O &\/ XU ti\J ilU / • 


74* 


83 


1162 


Y87330 


Homo sapiens 


Human signal peptide 
containing protein HSPP-107 


746 


83 


1163 


AF113534 


Homo sapiens 


firx— or/^ procein 


2723 


96 


1164 


AF232226 


T)a ni A vayiin 
VdllXU J.CJ. iU 




191 


41 


1165 


AL118501 


Homo sapiens 


dJ1191N16.1 (A novel protein 
(translation of the cDNA 


1051 


71 


1166 


AL118501 


Homo sapiens 


dJ1191N16.1 (A novel protein 
(translation of the cDNA 
DKFZp566A0946, Em:AL050069)) 


945 


76 


1167 


AF187733 






Oil 


42 


1168 


AB019435 


Homo sapiens 


^jiuu^jiiux i^asc 


OCT 
731 


55 


1169 


AF064604 


Homo sapiens 






33 


1170 


Y011<U 


Homo sapiens 


Polypeptide fragment encoded 


1191 


100 


1171 


L03188 


Saccharomyce 
s cerevisiae 


putative 


180 


22 


1172 


AF1137S1 


Mus musculus 


nuclear pore membrane 
glycoprotein POM210 


3941 


81 


1173 


AJ245417 


Homo sapiens 


G5b protein 


794 


100 


1174 


AL022238 


Homo sapiens 


dJ1042K10.3 (novel protein) 




xU U 


1175 


U41278 


Caenorhabdit 
is elegans 


F33G12.3 gene product 


332 


28 


1176 


M35$17 


Homo sapiens 


i"cen recBptor v - axjp.fi a - o — 
alpha region 


284 


83 


1177 


AC012680 


Arabidopsis 
thaliana 


putative protein phosphatase 
2C; 55455-56414 


*> q 


37 


1178 


G01345 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5426. 


692 


99 


1179 


AL09^76-) 


Homo sapiens 


dJ579N16.3 (novel protein 
similar to worm, Arabidopsis 
and pine proteins) 


1342 


100 


1180 


AF039716 


Caenorhabdit 
is elegans 


similar to ATP synthase B 
chain 


496 


55 


1181 


Y11710 


Homo sapiens 


collagen type XIV 


1048 


97 


1182 


X82240 


Homo 
sapiens] 
>R94974 
R94974 09- 
MAY-1996 27- 
OCT-1994 
Human TCL-1 
polypeptide. 
[Homo 
sapiens 


T cell leukemia/lymphoma 1 


617 


100 


1183 


U42B41 


Caenorhabdit 
is elegans 


short region of weak 
similarity to collagen 


161 


33 


1185 


AJ131613 


Homo sapiens 


dicarboxylate carrier protein 


1470 


99 


1186 


L27645 


Danio rerio 


growth-associated protein 


130 


i 


1187 


Y02738 


Homo sapiens 


Human secreted protein 
encoded by gene 89 clone 
HLHFP03. 


"636 

D J D 


Xvv 


1188 


AF217544 


Xenopus 
laevis 


ornithine decarboxylase-2 


1459 




1189 


AL136307 


Homo sapiens 


dJ380B8.2 (Neuritin, a 
protein which promotes 
neurite outgrowth) 


182 


J j 


1190 


X89602 


Homo sapiens 


rTSbeta 


197 


100 


1191 


U32828 


Haemophilus 

influenzae 

Rd 


ribosomal protein S6 
modification protein (rimK) 


268 


Jl 


1192 


AF154831 


Rattus 
norvegicus 


PV-1 


1403 


60 


1193 


Y50926 


Homo sapiens 


Human fetal brain cDNA clone 
vcl6_l derived protein. 


7Xa 


J.UU 


1194 


AF026530 


Rattus 
norvegicus 


stathmin-like-protein splice 
variant RB3 1 1 


1093 


97 


1195 


U35244 


Rattus 
norvegicus 


V^fliol a TT Vice's n nnrt i nrr 

homolog r-vps33a 




96 


1196 


Y70470 


Homo sapiens 


Human p53 target molecule/ 
PRG3 protein. 


1680 


100 


1197 


AF157318 


Homo sapiens 


AD-017 protein 




A *7 


1198 


AF125443 


Caenorhabdit 
is elegans 


contains similarity to S. 

^iiuapiiaLxuyi byllLIldBB 

(GB:Z28295) 


460 


39 


1199 


AF201954 


Homo sapiens 


DC12 


AOS J* 


88 


1200 


AL031775 


Homo sapiens 


dJ30M3.2 (novel protein 
similar to C. elegans 
Y63D3A.4) 


1902 


100 


1201 


M21103 


Ovis aries 


BIIIR4 high-sulfur keratin 


464 


82 


1202 


Z85986 


Homo sapiens 


aJTOSKll 1 [flimilflr f r> vtaci a ** 

suppressor protein SRP40) 




~7C 


1203 


U18762 


Rattus 
norvegicus 


retinol dehydrogenase type I 


890 


52 


1204 


U35730 


Mus musculus 


jerky 


223* 


76 


1205 


AB062327 


Homo sapiens 


KIAA0329 


X31 




1206 


AB019233 


Arabidopsis 
thaliana 


ubiquinone/menaquinone 
biosynthesis 
methyltransferase-like 


762 


56 


1207 


AL136307 


Homo sapiens 


dJ380B8.2 (Neuritin, a 
protein which promotes 
neurite outgrowth) 


1 tz 


100 


1208 


AF2079B9 


Homo sapiens 


orphan G-protein coupled 
receptor 


2326 


100 


1209 


Z97630 


Homo sapiens 


dJ466N1.4 (novel protein 
similar to ANK3 (ankyrin 3, 
node of Ranvier (ankyrin 
G))) 


181 


44 


1210 


U21549 


Mus musculus 


Ac39/physophilin 


1280 


68 


1211 


Y27700 


Homo sapiens 


Human secreted protein 
encoded by gene No. 12. 


1267 


J.UU 


1212 


AF117814 


Mus musculus 


odd-skipped related 1 protein 


945 




1213 


AF277233 


Naegleria 
fowleri 


calcineurin B 


222 


39 


1214 


D14849 


Mus musculus 


meiosis-specific nuclear 
structural protein 1 


19S0 


77 


1215 


003022 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7103. 


590 


100 


1216 


Z72510 


Caenorhabdit 
is elegans 


similarity to yeast UTR3 
protein (Swiss Prot accession 
yk677hll.5 comes from this 
gene 


634 


49 


1217 


249703 


Saccharomyce 
s cerevisiae 


unknown 


134 


22 


1218 


AC013430 


Arabidopsis 
thaliana 


F3F9.18 


199 


29 


1219 


L10910 


Homo sapiens 


splicing factor 


1026 


71 


1220 


Z70750 


Caenorhabdit 
is elegans 


similar to vanadate 
resistance protein 
transmembranous comes from 
this gene 


965 


sa 


1221 


ALl*381S 


Arabidopsis 
thaliana 


putative protein 


653 


61 


1222 


AF155100 


Homo sapiens 


zinc finger protein NY-REN-21 
antigen 


2261 


166 


1223 


J05071 


Bos taurus 


GTP-binding regulatory 
protein gamma-6 subunit 


356 


100 


1224 


Y73364 


Homo sapiens 


HTRM clone 2765991 protein 
sequence. 


1169 


99 


1225 


AL050170 


Homo sapiens 


hypothetical protein 


714 


100 


1226 


X64002 


Homo sapiens 


RAP74 


2661 


99 


1227 


X04085 


Homo sapiens 


catalase 


284G 


100 


1228 


AJ005620 


Mus musculus 


skeletal muscle-specific gene 


1416 


90 


1229 


AF045564 


Rattus 
norvegicus 


development-related protein 


1715 


93 


1230 


X97S7l 


Mus musculus 


HCMV-interacting protein 


479 


96 


1231 


L0B239 


Homo sapiens 


located at OATL1 


2274 


100 


1232 


AF121863 


Homo sapiens 


sorting nexin 14 


1964 


100 


1233 


AF121863 


Homo sapiens 


sorting nexin 14 


1203 


84 


1234 


AC024805 


Caenorhabdit 
is elegans 


contains similarity to 
TR:O04595 


744 


31 


1235 


AC006634 


Caenorhabdit 
is elegans 


contains similarity to 
Saccharomyces cerevisiae 
probable membrane protein 
YLR418C (GB:U20162) 


JO / 


33 


1236 


Y18101 


Mus musculus 


macrophage actin-associated- 

tyrosine-phosphorylated 

protein 


1559 


87 


1237 


AB04264S 


Homo sapiens 


TGIF2 


1224 


100 


1238 


AB026264 


Homo sapiens 


IMPACT 


1694 


100 


1239 


AB026264 


Homo sapiens 


IMPACT 


1123 


100 


1240 


G00429 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 4510. 


324 


100 


1241 


Y76144 


Homo sapiens 


Human secreted protein 
encoded by gene 21. 


1363 


53 


1242 


AL03S602 


Arabidopsis 
thaliana 


putative protein 


499 


28 


1243 


X764S3 


Gallus 
gallus 


Yes-associated protein 
(65kDa) 


574 


48 


1244 


AF220186 


Homo sapiens 


uncharacterized hypothalamus 
protein HT012 


503 


100 


1245 


AL021453 


Homo sapiens 


dJ821D11.3 (PUTATIVE protein) 


856 


100 


1246 


A*276003 


Homo sapiens 


GAR1 protein 


1216 


100 


1247 


Y57910 


Homo sapiens 


Human transmembrane protein 
HTMPN-34. 


1369 


98 


1248 


AC004874 


Homo sapiens 


similar to N- 
acetylgalactosaminyltransferase; 
similar to Q07537 
(PID:gll71989) 


957 


100 


1249 


AF199597 


Homo 
sapiens 


A-type potassium channel 
modulatory protein 1 


1139 


100 


1250 


Y13148 


Rattus 
norvegicus 


PAG60B 


1350 


88 


1251 


M24S52 


Rattus 
norvegicus 


neuron-specific protein PEP-19 


124 


46 


1252 


AF146738 


Rattus 
norvegicus 


testis specific protein 


771 


83 


1253 


G02725 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 68C6. 


419 


97 


1254 


W44375 


Homo sapiens 


Human ubiquitin-conjugating 
enzyme polypeptide. 


1045 


99 


1255 


AC006538 


Homo sapiens 


BC4H95 1 


831 


78 


1256 


AB004316 


Bos taurus 


mitochondrial methionyl-tRNA 
transformylase 


1556 


88 


1257 


Z35094 


Homo sapiens 


SURF-2 


1354 


97 


1258 


Y13362 


Homo sapiens 


Amino acid sequence of 
protein PRO214. 


2383 


100 


1259 


AC006014 


Homo sapiens 


similar to RFP transforming 
protein; similar to P14373 
(PID:gl32517) 


1299 


100 


1260 


AC005099 


Homo sapiens 


match to AI222572 
(NID:g3804775) 


' 4^9 


100 


1261 


V00507 


Homo sapiens 


coding sequence of DHFR (1 is 
1st base in codon) (561 is 
3rd base in codon) 


964 


100 


1262 


X15443 


Rattus sp. 


gamma-glutamyl transpeptidase 
(AA 1-568) 


697 


32 


1263 


AF173871 


Mus musculus 


neuronal PAS 3 


977 


94 


1264 


AF178983 


Homo sapiens 


Ras-associated protein Rapl 


433 


97 


1265 


Y70473 


Homo sapiens 


Human cyclic nucleotide- 
associated protein-1 (CNAP-1). 


2785 


99 


1266 


Y41738 


Homo 
sapiens 


Human PRO541 protein 
sequence. 


1622 


100 


1267 


AF061346 


Mus musculus 


Edpl protein 


1077 


64 


1268 


U97006 


Caenorhabdit 
is elegans 


C13F10.4 gene product 


154 


23 


1269 


AF233582 


Mus musculus 


GTPase Kab37 


942 


95 


1270 


AF195951 


Homo sapiens 


signal recognition particle 
68 


3127 


98 


1271 


AL031177 


Homo sapiens 


dJ889M15.3 (novel protein) 


1150 


55 


1272 


AF201933 


Homo sapiens 


DC11 


650 


100 


1273 


AF201933 


Homo sapiens 


DC11 


346 


98 


1274 


AL021710 


Arabidopsis 
thaliana 


putative protein 


348 


49 


1275 


AC004449 


Homo sapiens 


R33683 3 


556 


100 


1276 


Y8629S 


Homo sapiens 


Human secreted protein 
HL2AG87, SEQ ID NO: 210. 


1920 


100 


1277 


Y71111 


Homo sapiens 


Human Hydrolase protein-9 
(HYDRL-9). 


1576 


99 


1278 


S94421 


Homo sapiens 


T cell receptor eta-exon 


478 


100 


1279 


Y66695 


Homo 
sapiens 


Membrane-bound protein 
PRO1344. 


1909 


100 


1280 


AF161380 


Homo sapiens 


HSPC262 


772 


100 


1281 


Y48610 


Homo sapiens 


Human breast tumour- 
associated protein 71. 


779 


100 


1282 


AC015446 


Arabidopsis 
thaliana 


Similar to AIG1 protein 


406 


35 


1283 


AK024432 


Homo sapiens 


FLJ00022 protein 


403 


35 


1284 


W961S3 


Homo sapiens 


Human FADD-interacting 
protein (FIP). 


1825 


81 


1285 


AJ001019 


Homo sapiens 


ring finger protein 


1301 


100 


1286 


AE003823 


Drosophila 
melanogaster 


CG13178 gene product 


195 


29 


1287 


AF178632 


Homo sapiens 


FEM-l-like death receptor 
binding protein 


3261 


100 


1288 


AC006033 


Homo 
sapiens 


similar to MLN 64; similar to 

138027 (PID:g2135214) 


1195 


100 


1289 


AC006033 


Homo 
sapiens 


similar to MLN 64; similar to 
138027 (PID:g2135214) 


668 


93 


1290 


AB023811 


Homo sapiens 


TU3A 


» — 


54 


1291 


Z73424 


Caenorhabdit 
is elegans 


C44B9.1 


23S 


36 


1292 


Y94871 


Homo 
sapiens 


Human protein clone HP02551. 


1222 


100 


1293 


AF130425 


Homo sapiens 


retinoblastoma-associated 
protein RAP140 


489 


29 


1294 


G03856 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7937. 


538 


99 


1295 


AF133^70 


Mus musculus 


ARL-6 interacting protein-2 


367 


51 


1296 


AJ249735 


Homo sapiens 


claudin-6 


1142 


100 


1297 


X57560 


Escherichia 
coli 


pspE protein 


53* 


100 


1298 


AF169284 


Homo sapiens 


LIM and cysteine-rich domains 
protein 1 


1997 


100 


1299 


U41023 


Caenorhabdit 
is elegans 


coded for by C. elegans cDNA 
yk61fl,3; coded for by C. 
yki09h8.5 


324 


29 


1300 


AB024523 


Homo sapiens 


basic kruppel like factor 


1206 


100 


1301 


X55989 


Homo sapiens 


eosinophil cationic-related 
protein 


737 


99 


1302 


AF007151 


Homo sapiens 


unknown 


1481 


100 


1303 


X52904 


Escherichia 
coli 


open reading frame (AA 1-^5) 


359 


100 


1304 


U1957? 


Escherichia 
coli 


galactonate dehydratase 


242 


93 


1305 


AF266508 


Mus musculus 


NELF protein 


1409 


97 


1306 


Y57901 


Homo sapiens 


Human transmembrane protein 
HTMPN-25. 


932 


100 


1307 


U58750 


Caenorhabdit 
is elegans 


similar to the mitochondrial 
carrier family 


365 


54 


1308 


AF044774 


Homo sapiens 


breakpoint cluster region 
protein 2 


2681 


99 


1309 


AL078593 


Homo sapiens 


dJ21QBl.l (KIAA0680) 


267 


34 


1310 


X82693 


Homo sapiens 


E48 antigen 


620 


96 


1311 


Z82263 


Caenorhabdit 
is elegans 


C47A4.1 


283 


35 


1312 


AF131218 


Homo sapiens 


chromosome 16 open reading 
frame 5 


1493 


100 


1313 


Y41763 


Homo 
sapiens 


Human PRO938 protein 
sequence. 


1636 


100 


1314 


AF196972 


Homo sapiens 


JM24 protein 


2239 


100 


1315 


AP0533S6 " 


Homo sapiens 


insulin receptor substrate 
like protein 


228 


97 


1316 


Y66695 


Homo 
sapiens 


Membrane-bound protein 
PRO1344. 


1909 


100 


1317 


AF153127 


Gallus 
gallus 


SAPK interacting protein 


2442 


89 


1318 


AF153127 


Gallus 
gallus 


SAPK interacting protein 


1477 


83 


1319 
1320 


AF153127 
X56932 


Gallus 
gallus 

Homo sapiens 


SAPK interacting protein 
23 kD highly basic protein 


1651 


86 


1321 


AF17460S 


Homo 
sapiens] 
>Y83086 
Y83086 09- 
MAR-2000 28- 
AUG-1998 P- 
box protein 
FBP-18. 
[Homo 
sapiens 


F-box protein Fbx25 


1044 
467 


100 

70 


1322 


M61732 


Trypanosoma 
cruzi 


neuraminidase 


214 


24 


1323 


Y17013 


porcine 
endogenous 
retrovirus 


pol 


304 


64 


1324 


AL138655 


Arabidopsis 
thaliana 


putative protein 


1174 


37 


1325 


AL138655 


Arabidopsis 
thaliana 


putative protein 


94£~~ 




1326 


AL133215 


Homo sapiens 


bA108L7.2 (novel protein 
similar to rat tricarboxylate 
carrier) 


1322 


99 


1327 


AF161541 


Homo sapiens 


HSPC056 


1357 


99 


1328 


Y73346 


Homo sapiens 


HTRM clone 619699 protein 
sequence. 


785 




1329 


L10910 


Homo sapiens 


splicing factor 


912 


82 


1330 


AF146568 


Homo sapiens 


MiLl protein 


1936 


100 


1331 


W87772 


Homo sapiens 


Human serum glucocorticoid- 
regulated kinase (H-SGK2) 
polypeptide. 


232 


39 


1332 


Y41741 


Homo 
sapiens 


Human PRO704 protein 
sequence . 


1860 


100 


1333 


AF295096 


Homo sapiens 


zinc-finger protein ZBRK1 


411 


91 


1334 


282271 


Caenorhabdit 
is elegans 


Similarity to Mouse kinesin- 
like protein KIF4 comes from 
this gene 


578 


44 


1335 


AE000810 


Methanobacte 
rium 

thermoautotr 


conserved protein 


290 


43 


1336 


Y68779 


Homo sapiens 


Amino acid sequence of a 
human phosphorylation 


1019 


91 


1337 


AB027003 


M\ i tx rni \ cj f*»i 1 ] net 


protein phosphatase 


378 


84 


1338 


U648S6 


Caenorhabdit 
is elegans 


weak similarity to TPR 

uQZilclAilS 


215 


46 


1339 


AE001394 


Plasmodium 
falciparum 


protein of the YMR7 family 


170 


29 


1340 




Homo sapiens 


mt-11 protein 


204 


89 


1341 


AC011914 


Arabidopsis 
thaliana 


putative mutT protein; 68398- 


289 


45 


1342 


AJ276171 


Homo sapiens 


ASPIC 


2122 


100 


1343 


AF187016 


Homo sapiens 


myosin regulatory light chain 
interacting protein MIR 


2303 


99 


1344 


AC006963 


Homo sapiens 


similar to Kelch proteins; 
similar to baa / ? o// 

\ flu ,y4oDUa | i'i / 


894 


3S 


1345 


AF2574« 


Homo sapiens 


N-acetylneuraminic acid 
phosphate synthase 


1880 


99 


1346 


Y25B96 


Homo sapiens 


Human secreted protein 
fragment encoded from gene 
64. 


1148 


100 


1347 


AJ272073 


Torpedo 
marmorata 


male sterility protein 2-like 


1664 


58 


1348 


AF161S4 8 


Homo sapiens 


HSPCA6"} 


T fti a 
iUXO 




1349 


W78128 


Homo sapiens 


encoded by gene 3 clone 
HOSBI96 . 


1117 


100 


1351 


G02144 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6225. 


418 


100 


1352 


D90B69 


Escherichia 
coli 


similar to 


2047 


100 


1353 


A12029 


Homo sapiens 


MRP-14 


^13 


100 


1354 


AC005328 


Homo sapiens 


R26660_1, partial CDS 


870 


74 


1355 


AC024876 


Caenorhabdit 
is elegans 


contains similarity to 
SW:RPB1_CRIGR 


829 


61 


1356 


AF077226 


Homo sapiens 


copine III 


1876 


64 


1359 


AF217188 


Mus musculus 


YIP1B 


801 


63 


1360 


AC074331 


Homo sapiens 


ZNF234 


3869 


100 


1361 


AL163279 


Homo sapiens 


homolog to cAMP response 
element binding and beta 
transducin family proteins 


5035 


99 


1362 


Z48475 


Homo sapiens 


glucokinase regulator 


3160 


99 


1363 


Z48475 




glucokinase regulator 


2682 


97 


1364 


AF195764 


Homo sapiens 


megakaryocyte-enhanced gene 

LI ailoULipt J. piULCl.l, Huu X 1 

protein 


2055 


99 


1365 


AF116609 


Homo sapiens 


PRO0915 


581 


100 


1366 


AF116609 


Homo sapiens 




Do X 


100 


1367 


AL117352 


Homo sapiens 


dJ876B10.3 (novel protein 
similar to C. elegans 


2581 


99 


1368 


Y34124 


Homo 
sapiens 


Human potassium channel 
i\+xinovxj ■ 


1342 


100 


1369 




Homo sapiens 


CTL2 protein 


3728 


99 


1370 


AF008220 


Bacillus 


VtaG 




429 


45 


1371 


X05562 


Homo sapiens 


alpha-2 chain precursor (AA - 
25 to 1018) (3416 is 2nd base 
in codon) 


5908 


99 


1372 


Z98048 


Homo sapiens 


a>j*i\jQN4j . Q inovex uiidu aomain 


1296 


99 


1373 


AJ154415 


Homo sapiens 


FLASH 


10253 


100 


1374 


U202B6 


Rattus 
norvegicus 


lamina associated polypeptide 
1C 


1567 


69 


1375 


US3445 


Homo sapiens 


D0C1 


1645 


46 


1376 




Homo 
sapiens 


oAjy3Ji&.i (zinc ringer 
protein JJa vkux jxj j 


250 


60 


1377 


JTk*— u v J *-* Q 


numu od^> JL 


lubooo^i, partial cus 


1126 


100 


1378 


U35113 


Homo sapiens 


metastasis-associated gene 


1823 


69 


1379 


L153I3 


Caenorhabdit 


putative 


858 


58 


1380 


Y25756 




nuiDda secictcQ protein 

#=» r*i f i~i rl o rl f mm nana / £ 

cutuutju jlj.uui y die id . 


1508 


100 


1381 


AB037360 


Homo sapiens 


ANKHZN 


5734 


95 


1382 


AB037360 


Homo sapiens 


ANKHZN 


959 


97 


1383 


AF237676 


Mus musculus 


G beta-like protein GBL 


1721 


96 


1384 


AF237676 


Mus musculus 


G beta-like protein GBL 


1043 


70 


1385 


Y58793 


Homo sapiens 


Human calcium regulatory 

piDLcin LbaBU" X ( 


715 


100 


1386 


AF212162 






10369 


99 


1387 


AL031685 


Homo sapiens 


dJ963K23.2 (novel protein) 


337 


33 


1388 


AC004S90 




similar to zinc finger 
proteins; similar to BAA24380 

?NUDj1D WUOJXO Uj"ULi w lJ3b 
2 7 - APT? - 1 Q TRP-1 nrnt-o "i n 


542 


86 


1389 


AF187989 


Homo sapiens 


zinc finger protein ZNF223 


2665 


99 


1390 


AC035150 


Homo sapiens 


zinc finger protein 


3459 


100 


1391 


AF297B94 


Homo sapiens 


PIST 


1410 


97 


1392 


AF282265 




inner centromere protein 
INCENP 


1794 


99 


1393 


X90840 


Homo sapiens 


axonal transporter of 


45B4 


99 


1394 


AF076249 


Homo sapiens 


zinc finger protein SBBIZ1 


3208 


99 


1395 


G02224 




Human secreted protein, SEQ 
ID NO: 6305. 


299 


75 


1396 


AC004809 


Arabidopsis 
thaliana 


Similar to 


130 


34 


1398 


AF242519 


Homo sapiens 


zinc finger protein SBZF3 


181 


65 


1399 


AL133396 


Homo 
sapiens 


dJ1068H6.4 (prion protein 
like protein doppel) 


962 


100 


1400 


Y48611 


Homo sapiens 


Human breast tumour- 
associated protein 72. 


817 


99 


1401 


AC004472 


Homo sapiens 


PI. 11659 5 


280 


54 


1402 


X91489 


Saccharomyce 
s cerevisiae 


putative HMG box 


164 


27 


1403 


Y79222 


Homo 
sapiens 


Human transferase TRNSPS-14. 


2842 


100 


1404 


X81058 


Mus musculus 


tex261 


1010 


99 


1405 


AB012O84 


Mus musculus 


ITM 


194 


29 


1406 


AB030251 


Homo sapiens 


GTPase activating protein 


3233 


99 


1407 


AJ010585 


Rattus 
rattus 


PTB-like protein 


2684 


99 


1408 


X75760 


Drosophila 
melanogaster 


LRR47 


364 


29 


1409 


U76618 


Mus musculus 


N-RAP 


804 


48 


1410 


AC005578 


Homo sapiens 


P20887_1, partial CDS 


835 


63 


1411 


AE000284 


Escherichia 
coli 


orf, hypothetical protein. 


360 


100 


1412 


X01563 


Escherichia 
coli 


L5 (rplE) (aa 1-179) 


911 


100 


1413 


W78279 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 33. 


1264 


99 


1414 


AB031051 


Homo sapiens 


organic anion transporter 
OATP-E 


3832 


100 


1415 


M17466 


Homo sapiens 


coagulation factor XII 


3455 


100 


1416 


AF097994 


Homo 
sapiens 


L-kynurenine/alpha- 
aminoadipate aminotransferase 


2202 


99 


1417 


AF151077 


Homo sapiens 




1262 


99 


1418 


Y09945 


Rattus 
norvegicus 


putative integral membrane 


1098 


61 


1419 


U13152 


Mesocricetus 
auratus 


guanine nucleotide-binding 
protein beta 5 


2179 


76 


1420 


AU162458 


Homo sapiens 


ij/i** ojuiu . o vA.xMM.xxfo i novel 
protein, presumed ortholog 
of mouse K-Cl cotransporter 
KCC2)) 


5696 


100 


1421 


Y99426 


Homo sapiens 


Human PRO1604 (UNQ78S) amino 

QArninnra cun TH XTrt . 1 f\ Q 

bcau acquence o&Sc HU : JUo , 


152 


29 


1422 


Y94923 


Homo sapiens 


Human c^rrohori nmt-a j n ^1 

nuwctii. becfeucu procsin clone 
QS14 3 Orotein secruence SEO 
ID NO: 52. 


4039 


99 


1423 


AF177388 


Homo 
sapiens 


cancer-amplified 
transcriptional coactivator 
ASC-2 


_L U / 4 0 


sir 


1424 


Y48S17 


Homo sapiens 


Human breast tumour- 
associated protein 62. 


1851 


99 


1425 


AF208848 


Homo sapiens 


BM-006 


1454 


89 


1426 


AF208848 


Homo sapiens 


BM-006 


853 


79 


1427 


AF112886 


Bos taurus 


differentiation enhancing 
factor 1 


4693 


95 


1428 


041387 


Homo sapiens 


Gu protein 


1372 


63 


1429 


AF161S34 


Homo sapiens 


HSPC049 


2853 


78 


1430 


AFli5043 


Mus musculus 


bisphosphate 3'-nucleotidase 


275 




1431 


Y6671B 


Homo 
sapiens 


Membrane-bound protein 
PRO1106. 


1886 


100 


1432 


AF193613 


Homo sapiens 


cell recognition molecule 
Caspr2 


568 


100 


1433 


AB044£6'0 


Mus musculus 


Gliacoiin 


192 


34 


1434 


R99900 


Homo sapiens 


NTII-l nerve protein, 
facilitates regeneration of 
nerve cells. 


707 


51 


1435 


AF220530 


Homo sapiens 


myo-inositol 1-phosphate 
synthase A1 


29"ul~" " 


100 


1436 


X70944 


Homo sapiens 


PTB-associated splicing 
factor 


1261 


72 


1437 


AF271732 


Homo sapiens 


bridging integrator-3 


1282 


100 


1438 


Y30811 


Homo sapiens 


Human secreted protein 
encoded from gene 1. 


595 


98 


1439 


AJ293659 


Homo sapiens 


mucolipidin 


628 


97 


1440 


AF219138 


Homo sapiens 


GGA3 long isoform 


3083 


100 


1441 


AF219138 


Homo sapiens 


GGA3 long isoform 


3346 


100 


1442 


AB039669 


Homo sapiens 


ALKX3 


1944 


100 


1443 


AF2377H 


Drosophila 
melanogaster 


Diablo 


191 


27 


1444 


AJ011896 


Homo sapiens 


Nafi beta protein 


439 


39 


1445 


X73874 


Homo sapiens 


phosphorylase kinase 


6233 


9'd" 


1446 


AF214114 


Homo sapiens 


breast carcinoma-associated 
antigen BCAA 


3999 


99 


1447 


AF003924 


Homo sapiens 


ANC 2H01 


2645 


99 


1448 


AF003136 


Caenorhabdit 
is elegans 


contains weak similarity to 
an AMP-binding motif 


2843 


52 


1449 


AF155112 


Homo sapiens 


NY-REN-50 antigen 


1184 


89 


1450 


Y95004 


Homo sapiens 


Human secreted protein 
vc54_1, SEQ ID NO: 48. 


985 


100 


1451 


AF107203 


Homo sapiens 


ataxin 2-binding protein 


688 


P f 


1452 


AF107203 


Homo sapiens 


ataxin 2-binding protein 


456 


78 


1453 


Z386ii 


Mus musculus 


DMR-N9 


882 


56 


1454 


X90568 


Homo sapiens 


Protein sequence and 
annotation available soon via 
LABEIT@EMBL-Heidelberg.DE 


31V 


28 


1455 


AL035409 


Homo sapiens 


dJ564M11.3 (similar to
sialyltransferase)


1356 


100 


1456 


D44480 


Mus musculus


MATH-2 protein


272 


100 


1458 


AF141326 


Homo sapiens 


RNA helicase HDB/DICE1


478 


45 


1459 


AF242552 


Gallus 
gallus 


ret inovin 


945 


34 


1460 


U11036 


Homo sapiens 


Ibdl 


724 


84 


1461 


AB0"25^5l 


Mne mi) erii 1 w a 


y £ diiupnx j. in - d 


545 


39 


1462 


Y08134 


Homo sapiens 


acid sphingomyelinase-like
phosphodiesterase


2428 


99 


1463 


w \J J ^ I 


Homo sapiens


matcn CO cbiS 643979 
(NID:g774333) 


869 


98 


1464 


AC004997 


Homo sapiens


(NID:g573097) , R19699 
(NID:g774333) 


869 


98 


1465


U32743


Haemophilus
influenzae
Rd




315 


50 


1466 


Y09022 


Homo sapiens 


Not56-like protein 


2342 


100 


1467


AC003034 


Homo sapiens 


Hnmo 1 OCT O f" rat Iri Hn^v- 

specific (KS) gene 




99 


1468


AF071544 


Spinacia
oleracea


carboxylase/oxygenase small
subunit N-methyltransferase I




Zb 


1469


Y57930


Homo sapiens 


Human transmembrane protein
HTMPN-54.




100 


1470 


AF032666 


Rattus 
norvegicus 


rsec5 




» 


1471 


Y70467 


Homo sapiens 


Human membrane channel 
protein-17 (MECHP-17) . 


4^2" 


74 


1472 


AL031033 


Homo sapiens 


C321D2.1 (Ribosomal Large
Subunit Pseudouridine 
Synthase protein) 


1694 


100 


1473 


AF177292 


Homo sapiens 


genethonin 3 


4026 


98 


1474 


S45936 


Homo sapiens 


HTSl 


1101 




1475 


Y86^41 


Homo sapiens 


Human secreted protein 
HOABR60, SEQ ID NO: 156. 


1879 


no " 


1476 


AJ010317 


Fugu 

rubripes 


Sand 


1278 


68 


1477 


U42831 


Caenorhabditis
elegans


coded for by C. elegans cDNA
yk99b4.3; similar to human
transforming protein 
(PIR:S22157) 


846 


44 


1478 


X62447 


Homo sapiens 


PR 264 


543 


61 


1479 


X82209 


Homo sapiens 


MN1 


7116


100 


1480


U10536 


Pan paniscus 


MHC class I A


675 


84


1481 


AL078599


Homo sapiens 


dJ9Slc£ . 1 (novel protein 
similar to C. elegans 
F55A12.9 (Tr:P91086) ) 


1274 


65 


1482 


Z98977 


Schizosaccha 

romyces 

pombe 


putative vacuolar protein 


256 


29 


1483 


AB005662 


Mus musculus


JNK/SAPK-associated protein-1


4968 


92 


1484 


AL050120 


Homo sapiens 


hypothetical protein 


716 


100 


1485 


M2787B 


Homo sapiens 


DNA binding protein 


1006 




1486 


Y69161 


Homo sapiens 


Amino acid sequence of a
partial protein kinase. 


575 


99 


1487 


X841*6 


Saccharomyces
cerevisiae


ATH1 


341 


29 


1488 


AF038953 


Homo sapiens 


RNA helicase 


446 


34 


1489 


U56966 


Caenorhabditis
elegans


coded for by C. elegans cDNA 
yk30b3.5; coded for by C. 
elegans cDNA yk30b3.3 


'~G2b 


A 1 


1490 


AE000989 


Archaeoglobus
fulgidus


enoyl-CoA hydratase (fad-4)


533 


*k D 


1491 


M80633 


Rattus 
norvegicus 


adenylyl cyclase type IV 


707 






I /J J4Z 


Homo sapiens 


HTRM clone 2709055 protein
sequence.


3S13 


99 


1493 


"V17220 


Homo sapiens 


Human secreted protein (clone 
f j283-ll) . 


462


37 


1494 


AF133670 


Mus musculus 


ARL-6 interacting protein-2 


701 


97 


1495 


Y94897 


Homo 
sapiens 


Human protein clone HP10574.


1371 


100 


1496 
1497 


AL049^99 
AF037447 


Homo sapiens 
Homo sapiens 


d\7747H23,2 (novel protein) 
ribosomal S6 protein kinase 




100 


1498 


AL445067 


Thermoplasma 
acidophilum 


putative target YPL207W of 
the HAP2 transcriptional 
complex related protein 


2427 
269 


100 


1499 
1500 


AB039947 
AJ277750 


Homo sapiens 
Homo sapiens 


X11L-binding protein 51
UBASH3A protein 


227 


36 


1501 


AL050333 


Homo 
sapiens 


dJ93K22.1 (novel protein 
(contains DKFZP564B116) ) 


3509 
2439 


100 


1502 


AF179896 


Homo sapiens 


TALE homeobox protein Meis2b


1140 


100 


1503 


AF17894& " 


Homo sapiens 


TALE homeobox protein Meis2a 


1177 


100


1504
1505


Y53dflS 

X82494 


Homo sapiens 
Homo sapiens 


Human secreted protein clone 
pn749_8 protein sequence SEQ 
ID NO:16. 
fibulin-2 


1442 
3580 


99 
99 


1506 
1507


X98296
AL034548 


Homo sapiens 
Homo sapiens 


ubiquitin hydrolase 
dJ1103G7.6 (novel protein) 


783 
1098 


42
100 


1508 


Y76144 


Homo sapiens 


Human secreted protein 
encoded by gene 21. 


1736 


100 


1509 


AF220182 


Homo sapiens 


uncharacterized hypothalamus 
protein HT008


1181 


98 


1510 


U6466l 


Caenorhabditis
elegans


Gene probably begins in the
next cosmid


415 


58 


1511 


AL356192 


Neurospora 
crassa 


related to mdmi protein 


196 


29 


1512 
1513 


D17629 
AF168717 


Homo 
sapiens 
Homo sapiens 


N-acetylgalactosamine 6- 
sulfate sulfatase (GALNS) 
x 009 protein 


1829 
694 


100 
99 


1514 
1515 


AJ243531 
AC003672 


Homo sapiens 
Arabidopsis
thaliana


nM15 protein 

putative C3HC4-type RING zinc 
finger protein 


735 
407 


100 
30 


1516 


AF115435 


Rattus 
norvegicus 


syntaxin 17 


1374 


90 


1517 


AF003140


Caenorhabditis
elegans


C44E4.5 gene product 


274 


31 


1518 
1519


AB002584 
AL12l7(j4 


Rattus 

norvegicus 

Schizosaccha 


beta-alanine-pyruvate

aminotransferase

yeast atpl2 protein precursor 


2238 
270 


82 
30






romyces
pombe


homolog






1520 


AF255910


Homo 
sapiens 


vascular endothelial 
junction-associated molecule 


547 


100 


1521 


D31764 


Homo sapiens 


KIAA0064 


170 


27


1522 


Y66634 


Homo 
sapiens 


Membrane-bound protein
PRO190.


985


100 


1523 


Y94450 


Homo sapiens 


Human inflammation associated 
protein 


250 


43 


1524 


AC000107 


Arabidopsis 
thaliana 


F17F8.22 


277 


37 


1525 


AF109377 


Mus musculus 


IdlBp 


1277 


83 


1526 


AL031427 


Homo sapiens 


dJ167A19.4 (novel protein) 


1432 


99 


1527 


Y08135 


Mus musculus 


acid sphingomyelinase-like
phosphodiesterase 


1496 


79 


1528 


AK024423 


Homo sapiens 


FLJ00012 protein 


611 


100 


1529 


AF154502


Homo sapiens 


quiescent cell proline 
dipeptidase


679 


100 


1530 


AF205598 


Homo sapiens 


transposase-like protein 


1368 


100 


1531 


AF251039 


Homo sapiens 


putative zinc finger protein 


1420 


50 


1532 


W74805 


Homo sapiens 


Human secreted protein 
encoded by gene 77 clone 
HOEAS24.


493 


57 


1533 


AF039023 


Homo sapiens 


Ran-GTP binding protein;
RanBPS 


5707 


99 


1534 


AC007190


Arabidopsis
thaliana


F23N19.9


174


37


1535


AB027564 


Homo sapiens 


DINB1 


4462 


100 


1536 


Y36178 


Homo sapiens 


Human secreted protein


377 


87 


1537 


Y50907 


Homo sapiens 


Human fetal brain cDNA clone 
vb3_l derived protein. 


3593 


99 


1538 


AF017368


Mus musculus 


faciogenital dysplasia 
protein 2 


177 


47 


1539 


AF266756 


Homo sapiens 


sphingosine kinase


2011 


99 


1540 


Z48804 


Homo sapiens 


OA1 


2238 


100 


1541 


AF000195 


Caenorhabditis
elegans


Contains similarity to Pfam
domain: PF00169 (PH),
Score=20.6, E-value=1.9e-05,
N=1


379 


42 


1542 


Y711S9 


Homo sapiens 


Human phosphodiesterase 
interacting protein, 
myomegalin. 


9415 


99 


1543 


X76092 


Homo sapiens 


DNA binding protein RFX3 


3327 


100 


1544 


AB015330 


Homo sapiens 


HR1HFB2007 


631 


50 


1545


AF198487 


Homo sapiens 


transcription factor LDP-lb 


2822 


100 


1546 


AF016417 


Caenorhabdit 
is elegans 


Similar to BZIP transcription 
factor 


516 


42 


1547 


X55885 


Homo sapiens 


KDEL receptor 


iioS 


100 


1548


AB035495


Carassius
auratus


ubiquitin-activating enzyme 
El 


836 


42 


1549 


AL021707 


Homo sapiens 


dJ508I15.4 (KIAA0668) 


3688 


100 


1550 


AJ223978 


Bacillus 
subtilis 


YvqK protein 


292 


42


1551 


AF145615 


Drosophila
melanogaster


BCDNA.GH03377


822 


44 


1552 


AL157734 


Schizosaccha 

romyces 

pombe 


putative mannosyltransferase
involved in N-glycosylation 


435 


37 


1553 


AF079S2 7 


Mus musculus 


IhRS 


691 


63 


1554 


AB026291 


Rattus 
norvegicus 


acetoacetyl-CoA synthetase 


1099 


88 


1555 


Y44722 


Homo sapiens 


Human immune system molecule, 
ISMO-3. 


1780 


99 


1556 


AF116553 


Drosophila
melanogaster


antennal-specific short-chain
dehydrogenase/reductase


277 


32 


1557 


Y71056


Homo sapiens 


Human membrane transport


1975 


99








protein, MTRP-1. 






1558 


Y71056


Homo sapiens 


Human membrane transport 
protein, MTRP-1.


1975 




1559 


Y71056 


Homo sapiens 


Human membrane transport 
protein, MTRP-1.


1894 


97 


1560 


AF092050 


Mus musculus


beta-1,3-N-

acetylglucosaminyltransferase


262 




1561 


AI*109827 


Homo sapiens 


dtT309K20 1 ( arm unmal nrnhpin 

ACR55 (similar to rat sperm 
antigen 4 (SPAG4) ) ) 


X D U f 


q •? 


1562 


AJ131890 


Homo sapiens 


DNA polymerase lambda 


3002 


100 


1563 


AL035424 


Homo sapiens 


dA22D12.1 (novel protein 

omnia*, ts— uaupn 1 1 Ct t\Ci vu 

proteins) 


3015 


100 


1564 


AC002400 


Homo sapiens


to Ubicruitin bindinrr pn^vmp 




100 


1565 


AC005306


Homo sapiens 


R27216 1 


919 


82 


1566 


AF000195 


Caenorhabditis
elegans


Contains similarity to Pfam
domain: PF00169 (PH),
Score=20.6, E-value=1.9e-05,
N=1


550 


*D 


1567 


AB033281 


Homo 
sapiens 


F-box and WD-repeats protein 
beta-TRCP2 isoform C


2879 


100 


1568


D49473


Mus musculus


truncated form of Sox17


1047 


78 


1569 


AK025270




unnamed protein product


210 


91 


1570 


X75756 


Homo sapiens 


protein kinase C mu


4797 


99 


1571 


Af 1* J / X 


Homo sapiens 


"antiYb i 


2388 


100 


1572




jjr os opn 1 x a. 


ut?J.b4 4b gene product 


180 


31 


1573 


AF074603 


Streptomyces
griseus subsp.
griseus


NonF 


205 


38 


1574 


U28993 


Caenorhabditis
elegans


F22D3.3 gene product 


144 


27 


1575 


AF129507 


Homo sapiens 


transcription factor ICBP90 


287 


68 


1576 


X64B78 


Homo sapiens 


oxytocin receptor 


2002 


100 


1577 


■fVT Z. 3 i t XX 


Drosophila

melanogaster


UX aXlXO 


421 


54


1578


G00975 




Human secreted protein, SEQ 
Tr> no- Rn*?fi 


480 


100 


1579 


AF248744


Cryptosporidium
parvum


thrombospondin-related

adhesive protein


123 


33 


1580


AL121782


Homo sapiens 


uu jo jiii • z \iiwt;j. protein 

(translation of cDNA 
Em:AK000219> ) 


obJ 


100


1581 


AF041853 


Homo sapiens 


kinesin family member protein
KIF3A 


345 


33 


1582 


AF025441 


Homo sapiens 


Opa-interacting protein OIP5


1198 


100 


1583 


AEO01BO3 


Thermotoga 
maritima 


glycerate kinase, putative 


349 


34 


1584 


AF252283 


Homo sapiens 


Kelch-like 1 protein 


3973 


100 


1585 


AF169675 


Homo 
sapiens 


leucine-rich repeat
transmembrane protein FLRT1 


3494 


99 


1586


AF118274 


Homo sapiens 


DNb-5 


2628 


97 


1587 


X79440 


Homo sapiens


NADP+-dependent malic enzyme


3167 


99 


1588 


X99802 


Homo sapiens 


ZYG homologue 


3966 


99 


1589 


AF169803 


Homo sapiens 


flavohemoprotein b5+b5R


2563 


100 


1590 


Y29861 


Homo sapiens 


Human secreted protein clone 
cb98_4.


181 


47 


1591 


Z25535


Homo sapiens 


nuclear pore complex protein 
hnupl53 


7567 


99 


1592 


X13293 


Homo sapiens 


B-myb protein (AA 1-700) 


3678 


99 


1593 


M74027 


Homo sapiens 


mucin 


242 


27 


1594 


AL139314 


Schizosaccha 
romyces 


hypothetical protein 


235 


54 









pombe








1595


W18324 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 81. 


1318 


98 


1596 


Y94906 


Homo sapiens 


Human secreted protein clone 
rb649_3 protein sequence SEQ
ID NO: 18. 


2236 


98 


1597 


AF174605 


Homo sapiens 


F-box protein Fbx25 


1408 


99 


1598 


AB032254 


Homo 
sapiens 


bromodomain adjacent to zinc 
finger domain 2A 


9676 


98 


1599 


X73114 


Homo sapiens 


slow MyBP-C 


5568 


95 


1600 


X82200 


Homo sapiens 


gpStaf50


2305 


100 


1601 


Y00876


Homo 
sapiens 


Human LAPH-1 protein
sequence.


1149 


98 


1602 


AJ223351 


Homo sapiens 


HIRA-interacting protein 3 


2821 


99 


1603 


AJ222801 


Homo sapiens 


neutral sphingomyelinase 


2268 


99


1604 


AJ222801 


Homo sapiens 


neutral sphingomyelinase 


1601 


99 


1605


AF185S76 


Mus musculus 


POZ/zinc finger transcription 
factor ODA-8 


3435 


97 


1606 


AF093744 


Homo sapiens 


unknown 


131 


idb 


1607 


A12142 


synthetic 
construct 


IFN-pseudo-omega 2


BOO 


98 


1608 


Y57949 


Homo sapiens 


Human transmembrane protein 
HTMPN-73. 


1868


100


1609 


AF151044 


Homo sapiens 


HSPC210 


681 




1610 


X15218 


Homo sapiens 


ski protein (AA 1-728)


37^5 


100 


1611 


Y08200 


Homo sapiens 


rab geranylgeranyl 
transferase 


2976 


100 


1612 


AF220560 


Homo sapiens 


B/K protein 


OA** 


o a 


1613 


AC004481 


Arabidopsis 
thaliana 


nodulin-like protein 


371 


26 


1614 


Y09501 


Homo sapiens 


NADH-cytochrome-b5 reductase 


1607 


100 


1615 


Y15521 


Homo sapiens 


start position 1 




97 


1616 


AJ010750 


Rattus 
norvegicus 


apoptosis related protein-1
(CIPAR-1) 




62 


1617 


XS4079 


Homo sapiens 


S100 alpha protein 


481 


100 


1618 


V66"6"78 


Homo 
sapiens 


Membrane-bound protein
PRO1009.


967


100


1619 


AJ242973 


Homo sapiens 


peptide methionine sulfoxide 
reductase 


929 


100 


1620 


AF150733 


Homo sapiens 


AD-014 protein


288 


100 


1621 


AJ007509 


Homo sapiens 


E1B-55kDa-associated protein


4646 


98 


1622 


X64177 


Homo sapiens 


metallothionein


380


100


1623 


AE001045 


Archaeoglobus
fulgidus


A. fulgidus predicted coding 
region AF0859 


240 


36 


1624 


AL355013 


Schizosaccha 

romyces 

pombe 


mitochondrial carrier protein


403




1625 


Y66746 


Homo 
sapiens 


Membrane-bound protein
PRO1198.


1184 


100 


1626


D90053 


Sus scrofa 


destrin 


863 


100 


1627 


Y359S4 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO.
203. 


756 


100 


1628


AL031775 


Homo sapiens 


dJ30M3.2 (novel protein)


470 


100 


1629 


AF132484 


Mus musculus 


unknown 


286


68 


1630 


AFQ27096 


Drosophila 
melanogaster 


similar to C. elegans 
R10H10.6 and S. cerevisiae 
YD8419.03C 


493 


61 


1631 


X03077 


Homo sapiens 


lactate dehydrogenase-A


1704 


100 


1632 


AF151084 


Homo sapiens 


HSPC250 


76" 3 


100 


1633 


AJb6l874 


Homo sapiens 


orr 


255 


97 


1634 


AC012187 


Arabidopsis 
thaliana 


Contains weak similarity to 
GATA-6 DNA-binding protein 
gb|H36135, gb|Z26200 come 
from this gene. 


143 


38 





1635 


AF026246 


Homo sapiens 


HERV-E integrase


411 


~5n 


1636 


Y50943 


Homo sapiens 


Human adult brain cDNA clone 

Ve ft 1 derived nrnhpi n 


1126 


95 


1637 


AF134593 


Homo sapiens 


L-pipecolic acid oxidase 


2068 


99 


1638 


AJ238247 


Mus musculus 


tiutativp nhn qt">Vi ahacp cmhimir 


1 OA Q 


96 


1639 


Y94942 


Homo sapiens 


Human secreted protein clone 
ID NOT90. 


1320 


100 


1640 


AF235030 


Homo sapiens 




/DO 


99 


1641 


AF233288 


Drosophila 
melanogaster 


WDS 


358 


26 


1642 


M19351 


Mus musculus


immunoglobulin heavy chain


145


34 


1643 


Y70452 


Homo sapiens 


Human membrane channel

protein-2 (MECHP-2).


1352 


100 


1644 


AF176520 


Mus musculus 


WD repeat-containing F-box

pjLuucjiu ronj 


2676 


88 


1645 


W67816 


Homo sapiens 


Human secreted protein 
encoded by gene 10 clone 


1156 


100 


1646 


X67155 


Homo sapiens 


mitotic kinase-like protein-1 


4456 


99 


1647 


mc*3 inn 

lib J ±0 U 


Homo sapiens 


threonyl-tRNA synthetase


1040 


61 




Y87342 


Homo sapiens 


Human signal peptide 
containing protein HSPP-119 
olMJ 1LJ NU . ny . 


1566 


93 


1649 


R95332 


Homo sapiens 


Tumor necrosis factor 
receptor 1 death domain 
iiyduu icione Jin) , 


4137 


l60 


1650 


AC007136 


Homo sapiens 


Putative map kinase 
interacting kinase 


856 


99 


1651 


AB015346 


Homo sapiens


Pnoi CD 


4464


99 


1652 


AL161576 


Arabidopsis 

thaliana


putative protein 


1341 


48 


1653 


AC005313


thaliana 


putatlVB CaXlUQQUllD 


288 


28 


1654 


AL031428


Homo sapiens 


uuio^u?! ji iAi./v\uoui proLeini 


3526 


100 


1655 


AL031428 


Homo sapiens 


GU184J9.1 (KIAA0601 protein) 


3526 


100 


1656 


AB017910 


Dictyostelium
discoideum


myoM 


297 


32 


1657 


Y28919 


Homo
sapiens


Human regulatory protein 


2251 


99 


1658 


AF056191 


Homo sapiens 


TPA inducible protein 


2744 


98 


1659 


U76846


Arabidopsis
thaliana


ubiquitin-specific protease


137 


35 


ifi<Sd 


AL078627 


romyces

pombe


actin-iiKB procein; \<e accin 
domains! 


320 


34 


1662 


X52022 


Homo sapiens 


collagen type VI, alpha 3 
chain 


16274 


99 


1663 


AF300648 


Homo 
sapiens 


protein beta subunit 4 


loll 


100 




AF214736 


Homo sapiens 


EH domain containing protein 
2 


2774 


100 


1665 


Z48613 


Saccharomyce 
s cerevisiae 


unknown 


138


26


1666


AF177385


Homo 
sapiens 


cytochrome c oxidase assembly 
protein isoform 2 


1395 


99 


1667 


AC007842 


Homo sapiens 


BC331191 1 """" ' 


IDol 


47 


1668 


S67513 


Borna 
disease 
virus BDV, 
WT-1, Halle 
Bi/91, horse 
brain, field 
isolate. 
Peptide, 370 


P40 


397


43









aa 









1669 


Z99753 


Schizosaccha 

romyces 

pombe 


putative NOL1-NOP2-sun family
nucleolar protein 


569 


47 


1670 


G03130 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7211. 


427 


97 


1671 


M96625 


Gallus 
gallus 


cardiac muscle tensin


1185


54 


1672 


AF174482 


Homo sapiens 


polycomb 3 


2005 


99 


1673 


Y51B46 . 


Homo sapiens 


Human 18.1 homolog protein 
fragment.


233 


29 


1674 


AF255334 


Homo sapiens 


EXP35 


" lS2 " 


29 


1675 


Y9436"7 


Homo 
sapiens 


Human protein clone HP10563. 


i09 


30 


1676 


Y25712 


Homo sapiens 


Human secreted protein 
encoded from gene 2. 


3043 


99 


1677 


Y25712 


Homo sapiens 


Human secreted protein 
encoded from gene 2. 


"1*80 


91 


1678 


AF163151


Homo sapiens 


dentin sialophosphoprotein 
precursor 


170 


17 


1679 


AF163151


Homo sapiens 


dentin sialophosphoprotein 
precursor 


170 


17 


1680 


AK024453 


Homo sapiens 


FLJ00045 protein 


1349 


100 


1681


AF019236


Dictyostelium
discoideum


TipD 


613 


34 


1682 


AJ243459 


Leishmania
major


proteophosphoglycan


153


26


1683 


Z69369 


Schizosaccha 

romyces 

pombe 


putative GTP-binding protein 


560 


46 


1684 


X94910 


Homo sapiens 


ERp28


1334 


100 


1685 


AF286475 


Takifugu 
rubripes 


retinitis pigmentosa GTPase 
regulator-like protein


196


19


1686 


AF191298 


Homo sapiens 


vacuolar sorting protein 35 


4087 


100 


1687 
1688 


AJ275986
AJ275986 


Homo sapiens 
Homo sapiens 


transcription factor 
transcription factor 


2958 


100 


1689 


X07311 


Drosophila 
melanogaster 


heat shock protein 


1886 
138 


88 
43 


1690 
1691 


AF240463 
AJ272078


Rattus

norvegicus

Homo sapiens


LIS1-interacting protein
NUDE1

APOBEC-1 stimulating protein


1383
1256 


83 
58 


1692 
1693 


AJ272079 
AF177942 


Homo sapiens 

Xenopus 

laevis 


APOBEC-1 stimulating protein 
katanin p60 


1336 
1664 


60 
66 


1694 


AF263539




arginine N-methyltransferase


1774 


100 


1695 
1696 


A^522^89 
AK000193 


Homo 
sapiens 
Homo sapiens 


protein arginine N-

methyltransferase 1-variant 2

unnamed protein product 


1182 


81


1697
1698


AB041035


Homo sapiens


kidney superoxide-producing
NADPH oxidase


1060
3122 


100 
100 


1699 


AB041035 
AF025772 


Homo sapiens 
Homo sapiens 


kidney superoxide-producing
NADPH oxidase 


2181 


100 


1700 

1701 
1702 


Y44676 

AX022407 
AB024574 


Homo sapiens 

Coitm naniona 

Homo sapiens 


C2H2 zinc finger protein 
Human ARF-Related Protein-1 
(HARP-1).

unnamed protein product 
GTP-binding like protein 2 


488 

938 

315 
1172 


54 
97 

98 
100 


1703 
1704 
1705 


AF055078 
AF198092 
AE003573 


Homo sapiens 
Mus musculus
Drosophila 
melanogaster 


zinc finger protein 42 
RP42 

CG12474 gene product 


421 

1057 

161 


52 
77 
33 


1706 
1707 


AB036345 
Y55927


Drosophila 
melanogaster 
Homo sapiens 


aquaporin 


164


24


1708
1709 


U27121 

AX^91710 


Danio rerio 
Arabidopsis 


Human STLK2 protein. 
G12 

putative protein 


2146 

212 

505 


100 

47 

50 









thaliana 








1710 


B01311 


Homo sapiens 


Human PR0241 polypeptide. 


1649 


97 


1711 


U40750 


Mus musculus




4561


85 


1712 


AJ011118


Mus musculus


skeletal muscle and cardiac 
protein 


1490 


89 


1713 


AF255303


Homo
sapiens


membrane-associated nucleic
acid binding protein


4416 


- QQ 


1714 


AF255303 


Homo 
sapiens 


membrane-associated nucleic
acid binding protein


2960


100


1715 


U68227 


Rattus 
norvegicus 


Ras-related protein


511 


51 


1716 


AF168795


Rattus 
norvegicus 


schlafen-4 


1129 


44 


1717 


AF196304 


Homo sapiens 


SUMO-l-specific protease 


5804 


99 


1718 


AL355737 


Homo sapiens 


HMG20A 


1782 


JLUU 


1719 


AB029333 


Halocynthia 
roretzi 


HrPBT-1 


1069 


46 


1720 


AF071317 


Mus musculus


COP9 complex subunit 7b


1297 


97 


1721 


AJ272215 


Homo sapiens 


HEYL protein 


1 cm 
JLOO J. 


99 


1722 


G01982 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 6063.


718 


100 


1723 


AL032643 


Caenorhabditis
elegans


similar to Uncharacterized 
protein family UPF0034, 


82S 


41


1724 


G01972 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 6053.


cat! 

dob 


92 


1725 


Y94441 


Homo 
sapiens 


Human Adipose Specific 
Protein 1. 


1231 


100 


1726 


AP2S5443 




CGI-201 protein


a 1 q *? 


99 


1727 


API 834 2 6 


Homo sapiens 


HT004 protein 


l p 1 n 
x oxu 


99 


1728 


D10884 


Bos taurus 


neurocalcin 


1002 


99 


1729 


Z18529 


Gallus 
gallus 




14 11 


84 


1730 


Z73423 


Caenorhabditis
elegans


cDNA EST EMBL:Z14908 comes
from this gene-cDNA EST this 
gene 


233 


41


1732 


AFOSQB91 


Homo sapiens 


PR66105 


470 


30 


1733


A^277724 


Homo sapiens 


histone deacetylase 8 


Z UX D 


100 


1734 


G04050 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8131. 


503 


95 


1735 


D45913 


Mus musculus 




J J J 1 


94 


1736 
1737 


AF096709 
AF195120 


Drosophila 
virilis 
Homo sapiens 


failed axon connections 
protein 

dynactin p62 subunit 


276 


32 


1738 


KL5314 


is elegans 


contains similarity to Pfam
family PF01772 N=l 


2417 


99 
37 


1739 


X54618 


Listeria

monocytogenes


phosphatidylinositol specific


134 


27 


1740 


AL031658


Homo sapiens 


dJ310O11 A fnnvpl nrnfo ^ r\ 

similar to predicted C. 
elecrans an C infcesfcinali« 
proteins) 


123 


31 


1741 


Y35924 


Homo sapiens 


protein sequence, SEQ ID NO. 
173.




99 


1742 


AC013354 


Arabidopsis 
thaliana 


F15H18.15 


202 


32 


1743 


W75771 


Homo 
sapiens 


Human GTP binding protein 
APD08. 


1932 


59 


1744 


W75771 


Homo 
sapiens 


Human GTP binding protein 
APD08. 


1854 


61 


1745


AF221098 


Homo 
sapiens 


Ral guanine nucleotide 
exchange factor RalGPSIA 


1224 


70 


1746 
1747 


Y9$372 
Y94294 


Homo sapiens 
Homo sapiens 


Human PR01430 (UNQ736) amino 
acid sequence SEQ ID NO: 116. 
Human coenzyme A-utilising 


1332 
842 


99 
100
TABLE 2



SEQ 
ID 
NO: 



1748 



1749 



ACCESSION 
NUMBER 



AK02443T 



AE000877



Homo sapiens 



Methanobacte 
rium 

thermoautotr 
ophicum 



DESCRIPTION 



enzyme CoAEN-2.



FLJ00026 protein 



conserved protein 



SMITH- 
WATERMAN 
SCORE 



1619 



231 



IDENTITY 



100 



36 



1751
1752 



1753 



1754 



1755 



1756 



1757 



1767 



1768 
1769



Y15067 
AF251038 



Drosophila 
melanogaster 



Abnormal x segregation 



193 



Homo sapiens 



ZNF232



AC003093 



Homo sapiens 



Homo sapiens 



X69089 



AL049795 



Homo sapiens 



GAP-like protein
OXYSTEROL-BINDING PROTEIN;
45% similarity to P22059
(PID:g129308)



889 



822 
352 



165kD 



AL031393 



Homo sapiens 



Homo sapiens 



protein
dJ622L5.3 (novel protein)



5703 



aboToTtT 



Homo sapiens 



CU733D15 
protein) 



"l (zinc-finger 



1039 



2765 



UDP-GalNAc: polypeptide N-

acetylgalactosaminyltransferase

dJ1042K10.4 (novel protein)



2020 



Homo sapiens 



AC009176 



Arabidopsis
thaliana 



AK000647 



Homo sapiens



Human secreted protein 
encoded by gene No. 36. 
putative ribulose-1,5-
bisphosphate 

carboxylase/oxygenase small 
subunit N-methyltransferase I 



145 



TIT 



unnamed protein product 
VNN3 protein 



737 



33 



100



100 



57 



99



100



100 



99 




71 



2T 



9T 



1770 



1771 



1772 
1773 



U73522 



Homo sapiens 



U89435 
S70011 



Homo sapiens 



Mus musculus



AMSH 
unknown 



2665 
1214 



Rattus 



sp.



tricarboxylate carrier 



829 



1604 



99 



86 



95 



1774 



1775 
1776 



AL035086 
Y99426 



sapiens 



Homo sapiens 



dJ44A20.2 (novel protein) 



AF110330 



AJ269529
ZB1579 



Homo sapiens 



Homo sapiens 



Human PRO1604 (UNQ785) amino 
acid sequence SEQ ID NO: 308.
glutaminase 



2036 



1057 



glycerol 3-phosphate permease



3146 



2787 



100 



100 



100 



1778 



AY007239 



Caenorhabditis
elegans



1779 



AL109608 



Homo sapiens 



1780 



AF254260



Schizosaccha 

romyces 

pombe 



cdna EST yk'/bti.s comes from" 
this gene 
monooxygenase X 



232 



oxysterol-binding protein
family 



1781 



L07924 



Homo sapiens 



Mus musculus 



tuftelin 



1782 



AF295773 



1783 



AK024475 



Homo 
sapiens 



guanine nucleotide 
dissociation stimulator 



1784 



1785 



AK024475 
G03933 



Homo sapiens 



ral guanine nucleotide 
dissociation stimulator 



FLJ00068 



Homo sapiens 



Homo sapiens" 



8 protei n 
FLJ00068 protein



1875
644 



1729 



2TT 



142 



1786



S"82*37 



Homo sapiens 



Human secreted protein, SEQ 

ID NO: 8014. 

Ig lambda-like gene/beta-



4333



570 



247 



31 



99 



38 



Too" 

SO - 



49 



100 



93 



100










glucuronidase exon 11 homolog







TABLE 3



SEQ ID NO:


ACCESSION
NO.


DESCRIPTION


RESULTS*


2 


BL00240 


Receptor tyrosine kinase 
class III proteins. 


BL00240B 24.70 8.250e- 
12 157-181 


3 


PR00109


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE


PR00109D 17.04 8.085e-
13 358-381 


4 


BL00028 


Zinc finger, C2H2 type, 
domain proteins.


BL00028 16.07 9.400e- 
10 1129-1146 BL00028 
16.07 1.257e-09 820- 
837 


5 


BL00023 


Type II fibronectin 
collagen-binding domain
proteins.


BL00023 24.31 8.920e- 
33 413-450 BL00023 
24.31 4.545e-27 353- 
390 


6 


BL00023


Type II fibronectin
collagen-binding domain
proteins.


BL00023 24.31 8.920e- 
33 413-450 BL00023 
24.31 4.545e-27 353- 
390 


7 


BL00023 


Type II fibronectin 
collagen-binding domain
proteins. 


BL00023 24.31 8.920e- 
33 413-450 BL00023 
24.31 4.545e-27 353- 
390 


8


BL00023


Type II fibronectin 
collagen-binding domain 
proteins. 


BL00023 24.31 8.920e- 
33 413-450 BL00023 
24.31 4.545e-27 353- 
390 


9 



BL01160 


Kinesin light chain 
repeat proteins.


BL01160B 19.54 5.119e- 
09 863-917 


10


PR00464 


B-CLASS P450 GROUP II
SIGNATURE 


PR00464D 17.40 6.182e- 
12 294-312 PR00464G 
12.41 4.231e-11 377-
393 


11 


PR00734 


GLYCOSYL HYDROLASE 
FAMILY 7 SIGNATURE 


PR00734I 11.46 4.296e- 
09 502-520 


12 


PF00023 


Ank repeat proteins. 


PF00023B 14.20 6.500e- 
10 89-99 PF00023B 
14.20 2.636e-09 56-66 


14 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031B 15.41 3.848e- 
09 79-113 


15 


PR00208 


GLIADIN AND LMW GLUTENIN 
SUPERFAMILY SIGNATURE 


PR00208A 12.59 $.86"8e- 
10 517-535 PR00208A 
12.59 2.233e-09 520- 
538 


17 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 8.200e-
14 282-295 PD00066
13.92 9.400e-14 477-
490 PD00066 13.92
6.500e-13 505-518
PD00066 13.92 9.500e-
13 254-267 PD00066
13.92 1.429e-12 393-
406 PD00066 13.92
6.571e-12 421-434


18 


BL00845 


CAP-Gly domain proteins. 


BL00845 16.43 2.200e- 
25 55-80 


20 


BL00487 


IMP dehydrogenase / GMP 
reductase proteins. 


BL00487E 16.12 5.737e- 

18.79 8.984e-22 235- 
276 BL00487G 26.82 
4.082e-12 287-329


21 


BL00487 


IMP dehydrogenase / GMP
reductase proteins. 


BL00487E 16.12 5.737e- 
26 154-199 BL00487F 
18.79 8.984e-22 235- 
276 BL00487G 26.82 
4.082e-12 348-390 


22


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 3.250e- 
26 302-333


23 


BL00107


Protein kinases ATP-
binding region proteins.


BL00107A 18.39 3.250e-
26 302-333 


2£ 


BL00115 


Eukaryotic RNA 
polymerase II 
heptapeptide repeat 
proteins . 


BL00115T 8.45 7.273e- 
29 1208-1242 BL00115Q 
18.08 2.776e-21 953- 
983 BL00115Y 11.86
8.000e-17 1604-1650
BL00115M 19.19 8.130e-
16 731-774 BL00115H
14.34 9.392e-16 463-
496 BL00115A 15.44
7.414e-15 43-82
BL00115R 6.50 6.128e-
14 983-1010 BL00115J
16.71 9.289e-14 591-
617 BL00115I 8.33
4.336e-13 535-590
BL00115L 12.25 5.939e-
13 662-694 BL00115G
11.65 6.011e-13 435-
463 BL00115K 15.03
3.417e-10 617-659
BL00115O 16.76 5.805e-
10 863-913 BL00115P
11.54 7.538e-10 913-
953 BL00115S 18.24
7.968e-10 1010-1052 
BL00115U 10.34 4.475e- 
09 1242-1265 


26 


BL00420 


Speract receptor repeat 
proteins domain 
proteins.


BL00420A 20.42 4.109e- 
11 81-110 BL00420A 
20.42 8.820e-10 84-113 


27 


BL00050 


Ribosomal protein L23 
proteins. 


BL00050A 23.71 9.250e-
27 94-127 BL00050B
14.81 8.125e-12 133-
147 


28 


PR00925 


NONHISTONE CHROMOSOMAL 
PROTEIN HMG17 FAMILY 
SIGNATURE 


PR00925B 3.73 3.089e- 
10 41-54 


29 


PF00756


Putative esterase.


PF00756C 14.12 1.108e- 
09 486-516 


32 


BL00557 


FMN-dependent alpha-
hydroxy acid
dehydrogenases proteins.


BL00557D 17.76 5.0£5e-
37 274-316 BL00557A
35.08 8.909e-29 24-73
BL00557C 15.59 1.000e-
28 227-257 BL00557B 
21.27 8.898e-22 130- 
169 


34 


PR00629 


SHC PHOSPHOTYROSINE
INTERACTION DOMAIN
SIGNATURE


PR00629E 9.90 5.886e- 
35 299-328 PR00629F 
10.95 8.364e-32 334- 
361 PR00629B 13.66 
3.786e-27 224-247 
PR00629A 13.45 8.364e- 
21 206-222 PR00629C 
3.80 4.000e-12 249-261 
PR00629D 12.45 3.739e-
11 276-286 


35


PD01270 


RECEPTOR FC 
IMMUNOGLOBULIN AFFIN.


PD01270A 17.22 1.000e-
40 39-79 PD01270B
22.18 2.875e-38 94-131
PD01270D 24.66 3.700e-
34 171-207 PD01270C
19.54 3.455e-30 137-
166 


36


PD01270


RECEPTOR FC
IMMUNOGLOBULIN AFFIN.


PD01270A 17.22 1.000e-
40 39-79 PD01270B
22.18 2.875e-38 94-131
PD01270D 24.66 3.700e-
34 171-207 PD01270C
19.54 3.455e-30 137-
166


37 


BL00412 


Neuromodulin (GAP-43)
proteins. 


BL00412C 10.28 9.241e- 
10 264-298 


38 


BL00412 


Neuromodulin (GAP-43)
proteins. 


BL00412C 10.28 9.241e- 
10 264-298 


39 


BL00412 


Neuromodulin (GAP-43)
proteins.


BL00412C 10.28 9.241e-
10 264-298


40 


PR00380 


KINESIN HEAVY CHAIN
SIGNATURE 


PR00380B 12.64 7.366e-
14 342-360 PR00380C

394 PR00380D 9.93
2.180e-12 429-451 
PR00380A 14.18 5.154e- 
12 143-165 


44 


BL00345


Ets-domain proteins.


BL00345B 21.28 1.000e-
40 239-290 BL00345A 
13.96 2.452e-14 204- 
223 


45 


BL00345


Ets-domain proteins.


BL00345B 21.28 1.000e-
40 215-266 BL00345A
13.96 2.452e-14 180-
199 


46 


DM01551 


kw OSTEOINDUCTIVE YOPM
MEMBRANE OUTER.


DM01551A 15.63 3.538e-
26 172-202 DM01551C
14.62 3.571e-17 232-
252 DM01551B 8.84
4.750e-11 214-226


47 


PR00876


NEMATODE METALLOTHIONEIN
SIGNATURE


PR00876B 7.6^ 9. ^Se-
11 246-260


48 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 4.231e- 
33 6-45 


50


BL00972


Ubiquitin carboxyl-
terminal hydrolases
family 2 proteins. 


BL00972D 22.55 7.750e- 
19 994-1019 BL00972A 
11.93 7.120e-18 216- 
234 BL00972E 20.72 
9.471e-14 1020-1042 
BL00972C 16.48 7.000e- 
13 360-375 BL00972B 
9.45 8.269e-10 302-312 


51 


BL00972 


Ubiquitin carboxyl- 
terminal hydrolases 
family 2 proteins. 


BL00972D 22.55 7.750e- 
19 990-1015 BL00972A
11.93 7.120e-18 216-
234 BL00972E 20.72
9.471e-14 1016-1038
BL00972C 16.48 7.000e-
13 360-375 BL00972B
9.45 8.269e-10 302-312 


52 


BL01115 


GTP-binding nuclear
protein ran proteins. 


BL01115A 10.22 3.063e- 
14 10-54 


53 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 8.500e- 
17 20-38 PR00988F 
12.23 7.828e-15 196-
210 PR00988C 13.64 
6.108e-14 104-120 
PR00988E 8.27 3.872e- 

XX X I 'X X O O rAUVvOOL' 

S.95 6.878e-10 160-171 
PR00988B 11.60 2.915e- 
09 57-69 


55 


PR00762 


CHLORIDE CHANNEL 
SIGNATURE 


PR00762C 9.29 4.682e-
21 294-314 PR00762D
11.29 4.103e-19 509-
530 PR00762A 14.22
9.333e-18 199-217
PR00762F 15.12 3.100e-
16 563-583 PR00762B
12.12 6.063e-16 230-
250 PR00762E 12.07
2.286e-15 545-562
PR00762G 14.13 6.276e- 
13 601-616 


56 


BL00216 


Sugar transport 
proteins.


BL00216B 27.64 8.800e- 
10 153-203 


CO 

3D 


PF00791


Domain present in ZO-1
and Unc5-like netrin
receptors.


PF00791B 28.49 2.049e- 
10 1080-1135 


59 


PF00791 


Domain present in ZO-1
and Unc5-like netrin
receptors.


PF00791B 28.49 2.049e-

10 100^-111 / 


61


PD01929


ANTIBIOTIC TRANSFERASE
AM


PD01929E 10.75 9.018e-
09 206-221 


68 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360A 14.59 7.395e- 
09 680-693 


69 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360A 14.59 7.395e- 
09 670-683 


70 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 8.714e-
10 51-64 


72 


DM00179


w KINASE ALPHA ADHESION 
T-CELL. 


DM00179 13.97 5.304e- 
09 108-118 


73 


BL00239 


Receptor tyrosine kinase 
class II proteins. 


BL00239B 25.15 7.075e- 
12 118-166 


74 


BL00790 


Receptor tyrosine kinase 
class V proteins.


BL00790N 13.25 6.116e- 
10 93-120 


76 


DM00471 


0 PROKARYOTIC DNA 
TOPOISOMERASE I.


DM00471A 11.73 9.357e- 
13 53-66 DM00471B 
8.45 4.857e-12 70-81 


80 


PD02876 


DECARBOXYLASE 
PHOSPHATIDYLSERINE.


PD02876C 8.80 2.723e-
13 223-236 PD02876D
12.13 2.588e-12 334-
351 


81 


PD02876 


DECARBOXYLASE 
PHOSPHATIDYLSERINE.


PD02876C 8.80 2.723e- 
13 282-295 PD02876D 
12.13 2.588e-12 393- 
410 


83 


BL00708 


Prolyl endopeptidase 
family serine proteins. 


BL00708B 24.91 7.197e- 
12 570-601 


84


PR00014


FIBRONECTIN TYPE III 
REPEAT SIGNATURE 


PR00014C 15.44 8.043e- 
09 985-1004 


86 


PR00678


PI3 KINASE P85
REGULATORY SUBUNIT


PR00678H 9.13 1.379e-
09 246-269 


89 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320C 13.01 8.200e- 
09 264-279 PR00320B 
12.19 8.650e-09 264- 
279 


93 


BL00455


Putative AMP-binding
domain proteins. 


BL00455 13.31 2.588e- 
14 316-332 


95 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 4.000e-
10 123-154 


96 


BL00107


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 4.000e- 
10 212-243 


97 


PR00081 


GLUCOSE/RIBITOL
DEHYDROGENASE FAMILY
SIGNATURE 


PR00081B 10.38 6.318e- 
ii ^ia—^a£. DDonnoia 

1J ±JH— XHO fK\J\JVO±t\ 

10.53 2.500e-12 54-72


98 


PR00380


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 5.500e- 
24 401-423 PR00380D 
9.93 7.188e-20 613-635 
PR00380B 12.64 7.517e- 
16 529-547 PR00380C 
13.18 2.756e-13 560- 
579


102 


PR00300 


ATP-DEPENDENT CLP
PROTEASE ATP-BINDING 
SUBUNIT SIGNATURE 


PR00300A 9.56 7.545e- 
14 289-308 


104 


BL00479 


Phorbol esters / 
diacylglycerol binding
domain proteins. 


BL00479B 12.57 6.786e- 
18 298-314 BL00479A 
19.86 4.913e-16 155- 
178 BL00479A 19.86 
4.300e-13 272-295
BL00479B 12.57 6.294e- 
12 181-197 


106 


BL01019 


ADP-ribosylation factors 
family proteins. 


BL01019A 13.20 8.013e- 
12 43-83 


107 


DM01970 


0 Jew ZK632.12 YDR313C 
ENDOSOMAL III. 


DM01970B 8.60 5.000e- 
16 403-416 


108 


BL00191 


Cytochrome b5 family, 
heme-binding domain
proteins.


BL00191K 17.38 4.951e-
27 238-282 BL00191J 
11.37 6.447e-17 182- 
204 


109 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 4.938e- 
37 8-47 


110 


BL01138 


Scorpion short toxins 
proteins.


BL01138A 10.96 8.297e- 
10 38-50 


113 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 5.800e-
23 156-187 BL00107B
13.31 9.100e-14 225- 
241 


117 


BL00214 


Cytosolic fatty-acid 
binding proteins. 


BL00214B 26.51 1.000e-
17 46-91 BL00214A
21.17 7.052e-11 5-31


118 


BL00107 


Protein kinases ATP-
binding region proteins. 


BL00107A 18.39 8.560e-
13 36-67 


119 


PR00529


GONADOTROPHIN RELEASING
HORMONE RECEPTOR
SIGNATURE


PR00529C 11.03 7.506e-
10 158-177 


120 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 9.400e- 
09 80-95 


121 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 9.400e- 
09 80-95 


127 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15 R7 7 i?r*. 
13 216-241 


128 


BL01032 


Protein phosphatase 2C 
proteins. 


BL01032C 6 14 iqtip. 
12 147-157 BL01032H 
11.25 5.680e-11 318-
331 BL01032G 8.33
8.932e-11 282-296
BL01032I 10.42 8.902e-
09 379-389 


129 


BL01310 


ATP1G1 / PLM / MAT8
family proteins. 


BL01310 14.74 6.694e- 
26 28-64 


130 


PR00990


RIBOKINASE SIGNATURE 


PR00990B 12.32 9.534e- 
15 47-67 PR00990A 
16.23 5.500e-14 20-42 
PR00990C 12.62 2.412e- 
09 119-133 


133 


BL00880 


Acyl-CoA-binding
protein. 


BL00880 17.52 5.57^e- 
26 72-122 


134 


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 9.308e-
14 18-37 


135 


PR00215 


NEUROMODULIN SIGNATURE 


PR00215C 13.98 6.779e-
10 475-496 


136 


BL01310 


ATP1G1 / PLM / MAT8
family proteins.


BL01310 14.74 2.432e-
29 71-107 


140


BL00028 


Zinc finger, C2H2 type,
domain proteins. 


BL00028 16.07 7.882e-
14 214-231 BL00028
16.07 9.471e-14 102-
119 BL00028 16.07
2.800e-13 18-35
BL00028 16.07 5.500e-
13 74-91 BL00028
16.07 9.100e-13 186-
203 BL00028 16.07
8.043e-12 46-63
BL00028 16.07 6.435e-
12 130-147 BL00028
16.07 9.217e-12 270-
287 BL00028 16.07
6.192e-11 242-259
BL00028 16.07 4.000e- 

J- U 1D0*1 / 3 


141 


BL00501 


Signal peptidases I 
serine proteins. 


BL00501D 16.69 9.538e- 
14 113-133 BL00501C 
9.61 8.688e-10 89-101 


14* 


BL01020 


SAR1 family proteins.


BL01020C 15.35 7.722e- 
20 79-130 


146 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 6.400e- 
25 335-374 


149 


BL00126 


3'5'-cyclic nucleotide
phosphodiesterases
proteins.


BL00126C 22.07 1.450e- 
25 509-550 BL00126E 
35.22 3.951e-16 654- 
709 BL00126D 25.50 
1.360e-15 565-604 
BL00126B 15.20 8.200e- 
11 483-495 BL00126A 
27.56 8.269e-11 442-
479 


151 


BL00632


Ribosomal protein S4 
proteins. 


BL00632 23.79 5.271e- 
20 106-149 


154 


BL00559 


Eukaryotic molybdopterin 

oxidoreductases 

proteins. 


BL00559I 13.63 5.304e- 
19 29-58 BL00559K 
13.17 2.957e-18 172- 
199 BL00559J 19.63 
8.395e-13 99-151 
BL00559L 13.60 5.814e- 
12 241-259 


155 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.692e- 
13 13-35 


157 


BL00406 


Actins proteins.


BL00406D 12.58 2.547e- 
18 275-330 BL00406A 
9.95 5.776e-16 15-50 
BL00406B 5.47 7.429e- 
12 69-124 BL00406C 
6.75 9.682e-12 128-183


160 


BL00132


Zinc carboxypeptidases,
zinc-binding region 1 
proteins. 


BL00132A 26.07 7.000e- 
14 22-63 BL00132C 
21.35 3.466e-12 104- 
145 




PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 9.043e- 
13 139-158 


168


BL00362


Ribosomal protein S15
proteins.


BL00362 24.67 9.700e-
15 129-172 


169


BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 1.000e-
35 640-686 BL00039A 

1ft AA 1 OCila.1) 

10.44 X.:Jc>4e" , i4 Z12- 
251 BL00039B 19.19 
4.553e-13 378-404 
BL00039C 15.63 8.773e- 
12 465-489 


175 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 3.721e- 
12 14-36 


178 


BL01310 


ATP1G1 / PLM / MAT8
family proteins. 


BL01310 14.74 2.432e- 
29 133-169


179


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.4£i*e-
36 6-45


180


PR00007


COMPLEMENT C1Q DOMAIN
SIGNATURE


PR00007B 14.16 7.429e- 
20 160-180 PR00007A 
19.33 4.938e-19 133- 
160 PR00007C 15.60 
1.225e-15 206-228 
PR00007D 9.64 6.855e-
11 238-249 


181 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 9.526e-
24 280-323 


182 


BL00027 


'Homeobox' domain 
proteins. 


BL00027 26.43 9.526e- 
24 263-306 


183 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 9.526e- 
24 280-323 


184


BL00027


'Homeobox' domain 
proteins. 


BL00027 26.43 9.526e-
24 263-306 


188 


PR00929 


AT-HOOK-LIKE DOMAIN
SIGNATURE 


PR00929C 5.26 3.328e- 
09 460-471 


189 


PR00929 


AT-HOOK-LIKE DOMAIN
SIGNATURE


PR00929C 5.26 3.328e-
09 440-451 


190 


BL00383 


Tyrosine specific 
protein phosphatases 
proteins. 


BL00383F 15.51 7.188e- 
17 666-682 BL00383A 
13.34 8.714e-17 162-
177 BL00383E 10.35
1.000e-14 333-344
BL00383E 10.35 7.300e-
14 628-639 BL00383F
15.51 1.720e-13 371-
387 BL00383C 10.10
3.000e-13 217-228
BL00383D 11.92 7.000e-
13 295-308 BL00383B

1 I.B3ZC* IX lO / - X o 

BL00383C 10.10 1.750e- 

11.92 4.000e-09 589-
602 BL00383B 7.61 
8.000e-09 479-488 


191 


PR00450 


RECOVERIN FAMILY
SIGNATURE 


PR00450C 12.22 7.911e- 
15 83-105 PR00450C 
12.22 6.286e-13 47-69 


193 


PF00564


Octicosapeptide repeat 
proteins. 


PF00564B 24 74 £ 

16 227-278 


194 


PR00503 


BROMODOMAIN SIGNATURE


PR00503D 20.81 9.156e- 
15 204-224 

9.96 9.571e-13 170-187 


195 


BL00901 


Cysteine 

synthase/cystathionine 
beta-synthase P-
phosphate att. 


BL00901C 20.63 3.429e- 
18 67-117 


197 


BL00636 


Nt-dnaJ domain proteins. 


BL00636A 8.07 6.211e- 
17 40-57 BL00636B 
15.11 2.000e-13 67-88 


198 


PR00690 


ADHESIN FAMILY SIGNATURE


PR00690A 10.86 9.866e- 
09 463-482 


199 


BL01131 


Ribosomal RNA adenine
dimethylases proteins. 


BL01131A 26.62 2.343e- 
12 84-130 


201 


PR00910 


LUTEOVIRUS ORF6 PROTEIN
SIGNATURE 


PR00910A 2.51 8.252e- 
12 509-522 


203 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.286e- 
10 39-72 


206 


PR00261 


LOW DENSITY LIPOPROTEIN 
(LDL) RECEPTOR SIGNATURE


PR00261A 11.02 4.462e- 
19 65-87 PR00261C 
11.37 9.308e-19 65-87 
PR00261D 12.47 2.667e- 
18 65-87 PR00261B 
14.12 4.000e-18 143- 
165 PR00261A 11.02
4.833e-18 143-165 
PR00261D 12.47 7.500e- 
18 143-165 PR00261B 
14.12 5.065e-16 65-87 

rftvu*oj,v« XX. J/ 0>.i7O/e- 
16 143-165 PR00261F 
11.57 4.938e-13 143- 
165 PR00261E 11.08 
7.188e-13 65-87 
PR00261F 11.57 7.188e- 
13 65-87 PR00261E
11.08 1.643e-11 143-
165 


209 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors.


PF00791B 28.49 ^.143e- 
13 118-173 PF00791C 
20 98 7 £PJOp>-in no 
171 


211 


PR00007 


COMPLEMENT C1Q DOMAIN 
SIGNATURE 


rn.yuuu /rt 1?. d. /SJLC— 
19 131-158 PR00007B 

14 16 4 ll^ia-lfl ICQ 

178 PR00007C 15.60 
1.675e-15 201-223 
PR00007D 9.64 7.231e- 
11 233-244 


212 


BL00183


Ubiquitin-conjugating
enzymes proteins. 


BL00183 28.97 1.545e- 
30 43-51 


213 


BL00183 


Ubiquitin-conjugating
enzymes proteins.


BL00183 28.97 1.545e-
30 43-91


216


BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins. 


BL00039D 21.67 1.900e- 

^ Q >t nT ftft ai r\ yi 

*.y 50O"Ol4 dJjUOOJjA 

18.44 1.871e-23 21-60 
11 364-388 BL00039B 
303 


217 


BL00100


Chloramphenicol
acetyltransferase
proteins. 


09 GB-106 


219 


PR00213 


MYELIN P0 PROTEIN 
SIGNATURE 


PR00213C 15.94 3.969e-
11 199-227 


222 


BL00678 


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 1.947e-09
144-155 


224 


PR00875


MOLLUSC METALLOTHIONEIN
SIGNATURE


PR00875A 5.83 1.000e-
09 901-913 


225


BL00636


Nt-dnaJ domain proteins. 


OUVUOJOD 13 i XI 0.4vU6" 

19 18-39 


226 


BL00636


Nt-dnaJ domain proteins. 


BL00636A 8.07 1.000e-
21 21-38 BL00636B 
15.11 8.200e-19 45-66 


229 
230


PR00301


70 KD HEAT SHOCK PROTEIN 
SIGNATURE 


PR00301F 13.98 7.563e- 
13 329-346 PR00301G
13 7fl 4 300p_1O i/ri _ 
382 


231


BL00460


Glutathione peroxidases 
selenocysteine proteins. 


BL00460A ?fl 67 A 77i B 
20 35-70 BL00460B
9.73 7.429e-16 78-96 
BL00460C 14.35 2.831e- 
12 111-134 RTinndfinn 

^- 111 l.j*t DUUJ4 DWXJ 

16.89 8.773e-11 140-
160 


232


PR00647 


SENR ORPHAN RECEPTOR 
SIGNATURE 


PR00647B 10.19 8.522e-
09 273-287 


233 


BL00292 


Cyclins proteins.


BL00292B 20.31 7.429e-
27 244-275 BL00292A 
22.87 7.750e-27 201- 
235 


234 


PR00449


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 6.308e- 
13 7-29 PR00449C
17.27 4.462e-11 47-70
PR00449D 10.79 7.120e- 
11 109-123 


235 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 7.300e- 
10 251-265 PR00019B
11.36 5.320e-09 119- 
133 PR00019B 11.36 
1.000e-08 229-243 


236 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE


PR00019B 11.36 7.300e-
10 245-259 PR00019B
11.36 5.320e-09 113- 
127 PR00019B 11.36 
1.000e-08 223-237 


237 


PD00289 


PROTEIN SH3 DOMAIN 
REPEAT PRESYNA. 


PD00289 9.97 8.448e-09
67-81 


240 


PR00011 


TYPE III EGF-LIKE 
SIGNATURE 


PR00011D 14.03 3.492e- 
10 616-635 


241 


PR00011 


TYPE III EGF-LIKE 
SIGNATURE 


PR00011D 14.03 3.492e- 
10 616-635 


244 


BL00903 


Cytidine and
deoxycytidylate
deaminases zinc-binding
region s. 


BL00903 12.93 8.941e- 
12 54-64 


245 


DM00179 


w KINASE ALPHA ADHESION 
T-CELL. 


DM00179 13.97 8.043e- 
09 124-134 


246 


BL00246 


Wnt-1 family proteins.


BL00246D 23.97 1.000e-
40 186-239 BL00246E 
■dU.OZ x.uuue-4U JUb- 
351 BL00246B 13.69 
» .i/oe-jo i.Ub-140 
BL00246A 15.75 2.286e- 

15.56 4.857e-22 150- 
175


250


PR00927


ADENINE NUCLEOTIDE
TRANSLOCATOR 1 SIGNATURE


PR00927E 14.93 5.114e- 
10 253-275 


254 


BL00674 


AAA-protein family
proteins.


BL00674B 4.46 1.000e-
09 223-245 


255 


PD01796 


PROTEIN TRANSMEMBRANE 
COBALT ZINC CADMIU. 


PD01796 15.01 6.045e-
09 61-88 


255 


BL50002 


Src homology 3 (SH3) 
domain proteins profile. 


BL50002B 15.18 2.800e- 
10 421-435 


258 


PR00094


ADENYLATE KINASE 
SIGNATURE 


DDnnno^p io qa *> or»ft« 
rKU X&.Hh Z . ZuUQ- 

18 87-104 PR00094D 

177 PR00094A 10.31 
5.500e-14 11-25
PR00094B 11.01 4.115e- 
13 39-54 PR00094E 
11.25 7.333e-13 179- 
193 


259 


BL00892 


HIT family proteins.


nT.nnciQ*3A io e cnn Q 
a±j\j u a j 4J\ ±a . ± i a . auue- 

13 60-91


262


BL00388


Proteasome A-type
subunits proteins.


BL00388A 23.14 1.000e-
40 8-54 BL00388B

31 38 3 ftKA«»-33 /rc.i no 

BL00388D 20.71 1.000e-

18.79 8.147e-16 126- 
148 


264 


BL00903 


Cytidine and
deoxycytidylate 
deaminases zinc-binding 
region s. 


BL00903 12.93 5.821e- 
09 91-101 


267 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 1.529e-
09 241-257 


270 


BL00226


Intermediate filaments 
proteins . 


BL00226D 19.10 1.000e-
37 362-409 BL00226B
23.86 8.<M3e-35 196- 
244 BL00226C 13.23 
7.000e-20 261-292 
BL00226A 12.77 5.143e-
15 96-111 


271 


PD02952


KINASE TRANSFERASE
MULTIGENE FAMI.


PD02952C 15.76 9.731e- 
_b ^jb-zbb PDQ29S2B 
15.57 5.625e-09 215- 


272 


PD02929 


ADHESION GLYCOPROTEIN 
PRECURSOR I. 


PD02929A 28.27 1.000e-
40 106-160 PD02929B 
18.36 8.800e-17 179- 
199 


274 


BL01027 


Glycosyl hydrolases


BL01027B 15.34 3.486e- 


275 


PR00424 


ADENOSINE RECEPTOR 
SIGNATURE 


PR00424D 14.32 6.451e- 
11 39-59 


277 


BL00052


Ribosomal protein S7
proteins.


BL00052A 27.85 fi.OOOe- j 
13 137-184 BL00052B 
15.17 5.143e-12 208-
235 


279 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790N 13.25 5.659e- 
13 267-294 


280


PR00319


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319D 11.64 6.625e-
23 107-125 PR00319C
13.41 1.000e-21 89-105
PR00319A 15.27 8.364e-
21 51-68 PR00319B
11.47 8.200e-19 70-85


281 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE 


PR00319D 11.64 6.625e- 
23 94-112 PR00319C 
13.41 1.000e-21 76-92 
PR00319A 15.27 8.364e- 
21 38-55 PR00319B 
11.47 8.200e-19 57-72 


287 


PF00929 


Exonuclease.


PF00929D 16.17 7.366e- 
09 149-163 


291 


BL00326 


Tropomyosins proteins. 


BL00326A 14.01 2.360e- 
09 93-127


292 


BL00326


Tropomyosins proteins.


BL00326A 14.01 2.360e- 
09 93-127 


294 


PD00066


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 8.714e- 
12 203-216 


295


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 5.500e- 
15 322-339 BL00028 
16.07 9.471e-14 433-
450 BL00028 16.07
4.600e-13 648-665
BL00028 16.07 5.500e-
13 760-777 BL00028
16.07 9.550e-13 788-
805 BL00028 16.07
3.348e-12 704-721
BL00028 16.07 6.478e-
12 461-478 BL00028
16.07 8.435e-12 844-
861 BL00028 16.07
1.692e-11 593-610
BL00028 16.07 2.038e-
11 211-228 BL00028
16.07 5.154e-11 732-
749 BL00028 16.07
5.846e-11 377-394
BL00028 16.07 6.885e-
11 816-833 BL00028
16.07 7.231e-11 676-
693 BL00028 16.07
9.654e-11 564-581
BL00028 16.07 4.086e- 
09 517-534 BL00028 
16.07 7.429e-09 489- 
506 


296 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 8.333e- 
16 111-136 BL00215A 
15.82 2.723e-11 10-35
BL00215B 10.44 9.526e- 
11 152-165 BL00215B 
10.44 7.375e-10 59-72 
BL00215A 15.82 9.824e- 
10 205-230 


302 


PF00953 


Glycosyl transferase. 


PF00953C 19.70 8.773e- 
34 236-269 PF00953A 
19.68 5.000e-25 102- 

1.000e-13 182-194


304 


PF00152


tRNA synthetases class 
II. 


PF00152D 21.30 8.364e- 

28.03 9.250e-21 220-
257 PF00152B 15.67
2.658e-13 159-184
PF00152A 19.68 5.714e-
11 44-67 


305 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 8.250e-
35 37-76 


306


PD02784 


PROTEIN NUCLEAR 
RIBONUCLEOPROTEIN.


PD02784B 2fc?.46 5.840e- 
09 92-135 


307 


PR00454 


ETS DOMAIN SIGNATURE


PR00454C 11.24 7.808e-
09 1167-1186 


308 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE


PR00237E 13.03 5.091e-
13 188-212 PR00237G 
13, dj 7.2076-13 268- 
295 PR00237A 11.48 
4.375e-11 24-49
PR00237C 15.69 3.057e-
10 101-124 PR00237D
8.94 4.750e-10 137-159
PR00237F 13.57 5.364e-
10 230-255 PR00237B 
13.50 9.438e-10 57-79 


309 


BL00522 


DNA polymerase family X
proteins. 


nJUUlo^L. 11.90 /.577e- 
24 315-339 BL00522F 

Id CjO 1 C Att\ 
^ > y v 1 • JiUc Xd *t / U — 

494 BL00522A 25.52 
1.265e-14 179-226 
BL00522E 19.63 8.615e-
14 430-460 BL0052PR 
27.30 9.625e-12 267- 
313 


310 


BL00326 


Tropomyosins proteins. 


BL00326D 8.76 5.235e-
10 856-897 


312 


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins. 


BL00290A 20.89 4.706e- 
14 151-174 BL00290B 
13.17 9.000e-12 211- 
229 


313 

■51 c 


BL00345 


Ets-domain proteins. 


BL00345B 21.28 l.OOOe- 
40 34-85 BL00345A 
13.96 9.217e-16 1-20 


Jl3 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 5.091e-
15 63-76


317


BL01020


SAR1 family proteins.


BL01020C 15.35 3.198e- 
17 79-130 


318 


BL00216 


Sugar transport 
proteins. 


BL00216B 27.64 4.696e- 
11 164-214 


320 


PR00109 


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE


PR00109B 12.27 4.814e-
10 216-235


321 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 5.688e- 
10 329-372 


322 


PR00109


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE


PR00109B 12.27 8.765e-
12 558-577 


324 


BL01241 


Link domain proteins. 


BL01241 35.81 8.313e- 
30 183-236 BL01241 
35.81 3.222e-13 282-
335 


326 


BL00412 


Neuromodulin (GAP-43) 
proteins. 


BL00412D 16.54 4.000e- 
12 515-566 BL00412D 
16.54 5.705e-11 516-
567 BL00412D 16.54 
7.848e-10 518-569 
BL00412D 16.54 1.827e- 
09 514-565 BL00412D 
16.54 1.918e-09 513- 
564 BL00412D 16.54 
2.102e-09 520-571 


328 


BL00232 


Cadherins extracellular 
repeat proteins domain 
proteins. 


BL00232B 32.79 9.557e- 
20 151-199 BL00232B 
32.79 2.246e-18 41-89 
BL00232B 32.79 5.985e- 
18 370-418 BL00232B 
32.79 5.500e-16 258- 
306 BL00232B 32.79 
9.384e-15 475-523 
BL00232C 10.65 2.537e- 
12 256-274 BL00232C 
10.65 4.326e-11 368-
386 BL00232C 10.65
7.261e-11 473-491
BL00232C 10.65 7.457e- 
11 39-57 


330 


PR00454


ETS DOMAIN SIGNATURE


PR00454C 11.24 7.808e-
09 1167-1186 


331 


BL00598 


Chromo domain proteins. 


BL00598 14.45 8.393e- 
18 27-49 


333 


BL01016 


Glycoprotease family 
proteins . 


BL01016C 22.84 3.925e- 
32 70-115 BL01016E 
14.88 5.286e-19 149- 
177 BL01016H 13.71 
7.577e-13 291-301 
BL01016D 8.86 3.298e-
11 127-140 BL01016G 
7.14 5.622e-10 261-271 
BL01016A 5.65 7.167e- 
10 4-19 BL01016F 
13.34 1.563e-09 200- 
212 BL01016B 8.93 
8.855e-09 38-50 


339 


BL01115 


GTP-binding nuclear
protein ran proteins.


BL01115A 10.22 5.500e-
11 17-61 


340 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 1.231e- 
33 10-49 


341 


BL01160


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 5.042e- 
09 55-109 


342 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU. 


30 16-55 


343 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031A 16.80 1.000e-
40 20-68


346


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 4.764e-
11 135-154 


347 


PR00109 


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE


PR00109B 12.27 4.764e-
11 135-154


351 


BL01187


Calcium-binding EGF-like 
domain proteins pattern 
proteins . 


BL01187B 12.04 1.783e- 
13 100-116 BL01187B 
12.04 8.435e-13 276- 
292 BL01187B 12.04 
8.800e-11 13-29
BL01187B 12.04 7.429e- 
10 54-70 BL01187B 

o.725e-09 231- 
247 BL01187A 9.98 

7 firiflp.nQ *5CC ncn 

/.uuue-vi7 ^ dd — <j b / 


352 


PD00078 


REPEAT PROTEIN ANK 
NUCLEAR ANKYR. 


PD00078B 13.14 5.950e- 
10 366-379 PD00078B
13.14 4.522e-09 168- 
181 


354 


BL00380 


Rhodanese proteins.


BL00380F 9.76 6.694e- 
11 542-553 


355 


PF00628 


PHD-finger.


PF00628 15.84 1.000e-
11 116-131 


356 


PR00587 


SOMATOSTATIN RECEPTOR 
TYPE 1 SIGNATURE 


PR00587A 8.06 9.700e-
09 17-37 


359 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 4.462e-
15 261-274 PD00066
13.92 6.500e-13 233-
246 PD00066 13.92
4.300e-09 289-302


361 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin
receptors. 


PF00791B 28.49 9.604e- 
13 54-109 PF00791B 
28.49 1.095e-12 21-76 
PF00791A 27.85 1.432e- 
09 71-126 PF00791B 
28.49 7.440e-09 184- 
239 


362 


PF00791 


Domain present in ZO-1
and Unc5-like netrin
receptors.


PF00791B 28.49 2.273e- 
11 279-334 


363


PR00450


RECOVERIN FAMILY
SIGNATURE


PR00450C 12.22 5.080e-
10 73-95 PR00450C
12.22 3.278e-09 109-
131 


364 


PF00242 


DNA polymerase (viral) 
N-terminal domain 
proteins.


PF00242Q 13.51 2.328e-
09 22-68 


365 


PF00242


DNA polymerase (viral)
N-terminal domain
proteins.


PF00242Q 13.51 2.328e- 
09 22-68 


366 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 6.644e- 
09 1038-1092 


367


PR00019


LEUCINE-RICH REPEAT
SIGNATURE


PR00019B 11.36 1.360e-
09 229-243 PR00019B
11.36 6.040e-09 91-105
PR00019A 11.19 8.667e-
09 370-384


368


PR00011


TYPE III EGF-LIKE
SIGNATURE


PR00011D 14.03 9.000e-
15 30-49 PR00011A
14.06 9.830e-15 30-49
PR00011B 13.08 4.500e-
14 30-49 PR00011C
24.25 5.143e-09 6-35 


369 


BL01032 


Protein phosphatase 2C 
proteins . 


BL01032H 11.25 4.150e-
09 417-430 


372 


BL00478 


LIM domain proteins. 


BL00478B 14.79 7.750e- 
12 410-425 


373


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.757e-
34 26-65


376


PR00170


SODIUM CHANNEL SIGNATURE


PR00170E 6.48 2.739e-
10 88-118


380 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 1.000e-
23 276-307 BL00107B 
13.31 1.692e-12 342- 
358 


381


BL00455


Putative AMP-binding 
domain proteins. 


BL00455 13.31 5.714e-
12 50-66 


382 


PR00624 


HISTONE H5 SIGNATURE


PR00624G 4.08 4.900e- 
09 524-544 


384 


PD00078 


REPEAT PROTEIN ANK 
NUCLEAR ANKYR. 


PD00078B 13.14 5.950e-
10 366-379 PD00078B 
13.14 4.522e-09 168- 
181 


385 


PR00511 


TEKTIN SIGNATURE 


PR00511D 7.11 5.371e- 
09 67-80 


386 


PD02870 


RECEPTOR INTERLEUKIN-1
PRECURSOR.


PD02870B 18.83 6.000e- 
10 97-130 


388 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 5.000e-
13 516-529 


389 


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins. 


BL00290A 20.89 7.657e-
09 151-174 


390 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 5.200e- 
15 221-246 BL00215A
15.82 7.618e-14 20-45
BL00215A 15.82 8.851e-
11 123-148 BL00215B
10.44 9.526e-11 69-82
BL00215B 10.44 7.300e- 
09 272-285 BL00215B 
10.44 8.500e-09 165- 
178 


394 


BL00674 


AAA-protein family 
proteins.


BL00674B 4.46 2.723e- 
16 299-321 


397 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 8.579e- 
11 141-155 


398 


PR00761


BINDIN PRECURSOR 
SIGNATURE 


PR00761B 9.93 6.764e- 
09 55-74 


399 


BL00240


Receptor tyrosine kinase 
class III proteins. 


BL00240B 24.70 7.907e- 
10 118-142 


401 


PF00676


Dehydrogenase E1
component.


PF00676B 24.71 8.071e-
18 331-369 PF00676D
14.40 3.854e-15 486-
506 PF00676C 16.88 
9.182e-14 454-478 


402 


BL00514 


Fibrinogen beta and 
gamma chains C-terminal
domain proteins.


BL00514C 17.41 4.673e-
28 4432-4469 BL00514G
15.98 6.092e-14 4555-
4585 BL00514D 15.35
2.532e-12 4473-4486
BL00514F 11.65 4.288e-
10 4519-4534 BL00514H
14.95 4.955e-10 4584-
4609 


403 


PF00992 


Troponin.


PF00992A 16.67 5.974e-
09 105-140


404


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE


PR00019B 11.36 1.450e-
10 73-87 PR00019A
11.19 8.043e-10 76-90
PR00019B 11.36 1.000e-
09 50-64 PR00019B 
11.36 1.000e-09 96-110 


40* 


BL00232 


Cadherins extracellular 
repeat proteins domain 
proteins. 


BL00232B 32.79 9.557e-
20 139-187 BL00232B
32.79 2.246e-18 29-77
BL00232B 32.79 5.985e-
18 358-406 BL00232B
32.79 5.500e-16 246-
294 BL00232B 32.79
9.384e-15 463-511
BL00232C 10.65 2.537e-
12 244-262 BL00232C
10.65 4.326e-11 356-
374 BL00232C 10.65
7.261e-11 461-479
BL00232C 10.65 7.457e- 
11 27-45 


407 


PF00426 


Outer Capsid protein VP4 
(Hemagglutinin).


PF00426S 15.67 5.634e- 
09 902-940 


409 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 9.695e- 
09 126-180 


410 


BL00741 


Guanine-nucleotide
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 2.731e- 
09 252-275 


411 


PF00646 


F-box domain proteins. 


PF00646A 14.37 6.344e- 
09 86-100 


412 


BL00603 


Thymidine kinase 
cellular-type proteins.


BL00603B 11.39 8.500e- 
09 542-557 


415 


BL00866 


Carbamoyl-phosphate
synthase subdomain 
proteins. 


BL00866B 36.29 3.571e- 
31 245-291 BL00866C 
23.26 9.000e-25 331- 
366 


418 


PR00239 


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE. 


PR00239E 1.58 6.114e- 
09 590-602 


421 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors.


PF00791B 28.49 7.955e-
14 23-78 PF00791B 
28.49 3.653e-12 273- 
328 PF00791B 28.49 
4.273e-11 156-211
PF00791B 28.49 7.818e- 
11 89-144 PF00791B 
28.49 1.524e-10 56-111 
PF00791C 20.98 3.559e- 
09 37-76 PF00791C 
20.98 5.235e-09 170- 
209 PF00791C 20.98 
5.235e-09 381-420 
PF00791B 28.49 6.202e- 
09 189-244 PF00791B 
28.49 7.028e-09 435- 
490 PF00791B 28.49 
8.679e-09 367-422 


424 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 7.207e- 
28 1645-1679 


425 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE


PR00109D 17.04 5.881e-
10 228-251 


429 


BL00518 


Zinc finger, C3HC4 type
(RING finger), proteins. 


BL00518 12.23 4.600e- 
11 31-40 


431 


BL00039 


DEAD-box subfamily ATP-
dependent helicases 
proteins. 


BL00039D 21.67 1.844e- 
34 490-536 BL00039A 
18.44 5.615e-19 205- 
244 BL00039B 19.19 
8.920e-16 251-277
BL00039C 15.63 5.781e- 
15 333-357 


432 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 7.652e- 
12 169-185 


433 


PR00828 


FORMIN SIGNATURE


PR00828B 5.23 8.218e- 
10 382-405 


436 


BL00415 


Synapsins proteins. 


BL00415N 4.29 8.643e- 
11 195-239 BL00415N 
4.29 3.036e-09 809-853 


443 


PR00834 


HTRA/DEGQ PROTEASE 
FAMILY SIGNATURE 


PR00834F 10.91 6.040e- 
11 221-234 


446


PF01140


Matrix protein (MA),
p15.


PF01140D 15.54 9.663e-
10 183-218 PF01140D 
15.54 3.093e-09 246- 
281 


449


PR00568


DOPAMINE D3 RECEPTOR 
SIGNATURE 


PR00568G 13.95 5.551e-
09 39-53 


451 


PF00084 


Sushi domain proteins 
(SCR repeat proteins.


PF00084B 9.45 3.813e- 
10 47-59 


452 


BL00790 


Receptor tyrosine kinase 
class V proteins.


BL00790I 20.01 2.821e-
09 618-649 


456 


PR00380


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 1.000e-
25 77-99 PR00380D 
9.93 1.000e-21 281-303 
PR00380C 13.18 8.286e- 
17 230-249 PR00380B 
12.64 4.724e-16 194- 
212


457


PR00253 


GAMMA-AMINOBUTYRIC ACID 
(GABA) RECEPTOR 
SIGNATURE 


PR00253A 9.15 9.143e-
24 246-267 PR00253B 
13.47 2.000e-23 272- 
294 PR00253C 13.85 
7.000e-23 306-328 
PR00253D 16.68 5.950e-
21 452-473 


467 


PR00849


GLYCOSYL HYDROLASE 
FAMILY 58 SIGNATURE 


PR00849D 9.77 9.236e- 
09 910-937 


471 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins.


BL00678 9.67 8.200e-12 
33-44 


472 


BL00226


Intermediate filaments
proteins.


BL00226B 23.86 3.721e-
09 282-330 


473 


BL00344 


GATA-type zinc finger
domain proteins. 


BL00344 17.99 7.000e- 
12 814-852 


474 


BL00481


Thiol-activated
cytolysins proteins. 


BL00481E 13.07 0.909e-
09 173-199 


479


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE 


PR00319B 11.47 2.571e- 
09 393-408 


480 


PD01066


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 1.900e- 
38 8-47 


481 


PR00405 


HIV REV INTERACTING
PROTEIN SIGNATURE 


PR00405C 19.41 1.000e-
19 451-473 PR00405B
11.83 4.333e-19 430-
448 PR00405A 17.71 
4.971e-18 411-431 


482 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 9.286e- 
10 959-974 PR00049D 
0.00 9.857e-10 958-973
PR00049D 0.00 1.305e-
09 937-952 PR00049D 
0.00 8.322e-09 939-954 


486 


PR00007 


COMPLEMENT C1Q DOMAIN 
SIGNATURE 


PR00007B 14.16 8.615e- 
23 653-673 PR00007A 
19.33 6.192e-22 626- 
653 PR00007C 15.60 
5.846e-19 698-720 
PR00007D 9.64 3.647e- 
13 732-743 


487 


PD00567


PROTEIN RNA-BINDING RNA
REPEAT HYD.


PD00567B 18.23 2.853e- 
09 200-214 


488


PR00988


URIDINE KINASE SIGNATURE


PR00988A 6.*9 4.569e-
12 3-21


489


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 4.882e-
27 30-69 PD01066
19.43 3.430e-10 71-110


490 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.864e-
09 663-678 


492


BL01128


Shikimate kinase
proteins.


BL01128A 18.84 6.464e-
17 58-92


497


PF00429


ENV polyprotein (coat
polyprotein).


PF00429 31.08 7.171e-
15 21-71


498 


BL00120 


Lipases, serine 
proteins. 


BL00120B 11.37 7.923e-
09 185-200 


500


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030A 14.39 7.353e- 
11 299-318 


501 


BL01159 


WW/rsp5/WWP domain 
proteins. 


BL01159 13.85 8.579e- 
12 131-146 


505 


BL00021 


Kringle domain proteins. 


BL00021B 13.33 3.739e- 
17 492-510 


508


PR00120


H+-TRANSPORTING ATPASE
(PROTON PUMP) SIGNATURE 


PR00120C 9.90 5.800e- 
19 705-722 


509 


DM01417


6 kw INDUCING XPMC2 
MUSHROOM SPAC22G7.04. 


DM01417E 20.62 2.938e- 
16 362-395 DM01417D 
11.08 3.800e-13 322- 
338 


510 


PF00534 


Glycosyl transferases 
group 1. 


PF00534B 14.47 6.625e- 
09 346-370 


511 


PF00534 


Glycosyl transferases 
group 1.


PF00534B 14.47 6.625e-
09 293-317 


512 


PF00534 


Glycosyl transferases 
group 1. 


PF00534B 14.47 6.625e-
09 366-390 


513 


PD01841 


PHOSPHORYLASE KINASE 
ALPHA MUSCL. 


PD01841A 21.71 1.000e-
40 110-160 PD01841B
14.35 1.000e-40 181-
222 PD01841D 17.87
1.000e-40 243-295
PD01841F 13.36 1.000e-
40 333-382 PD01841G
24.26 1.000e-40 386-
440 PD01841L 18.42
1.000e-40 968-1010
PD01841I 23.00 4.545e-
37 762-804 PD01841E
18.60 3.750e-36 295-
333 PD01841J 14.94 
6.023e-35 851-888 
PD01841H 21.30 2.909e- 
33 490-527 PD01841K 
14.81 7.088e-33 924- 

i , DU1841C 13.78 
9.386e-23 222-243

trU\JXt3 l kxn lU.d/! o . 3?4e- 

21 1054-1073 PD01841I 
591 


514 


PR00153


CYCLOPHILIN PEPTIDYL- 
PROLYL CIS-TRANS
ISOMERASE SIGNATURE 


DT?rtrt1 CO/** oi i i oo — 

rAUUisjv. 11 . UX / , looe- 
13 95-111 PR00153E 
9.10 4.150e-12 122-138 


515 


BL00740 


MAM domain proteins. 


BL00740A "* 3 87 7 T ARp. 
12 410-423 


516


DM00892


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 6.087e- 
12 1018-1052 


517 


BL00242 


Integrins alpha chain 
proteins. 


BL00242C Ifi Hfi R i9f) p _ 
09 12-42 


523 


DM00031 


IMMUNOGLOBULIN V REGION.


DM00031A 16.80 3.750e- 
39 20-68 DM00031B 

IS 41 1 000a. OK 04,110 


52*


BL00319 


Amyloidogenic 
glycoprotein 
extracellular domain 
proteins.


BL00319C 17.12 8.375e-
10 61-95 


526 


PF00789 


Domain present in 
ubiquitin-regulatory
proteins.


PF00789B 19.70 3.308e-
12 322-343 PF00789C 
20.98 5.269e-09 367- 
392 


528


BL01162 


Quinone oxidoreductase /
zeta-crystallin
proteins.


BL01162C 22.80 1.500e- 
16 120-164
529


PR00910


LUTEOVIRUS ORF6 PROTEIN
SIGNATURE


PR00910A 2.51 3.893e-
09 60-73


532


BL00215


Mitochondrial energy
transfer proteins.


BL00215A 15.82 4.000e-
17 11-36 BL00215A
15.82 8.660e-11 123-
148


533


BL00215


Mitochondrial energy
transfer proteins.


BL00215A 15.82 4.000e-
17 11-36 BL00215A
15.82 8.660e-11 97-122




BL00098


Thiolases acyl-enzyme
intermediate proteins.


BL00098C 21.65 2.800e-
38 181-227 BL00098B
32.59 5.345e-38 86-141
BL00098D 26.30 8.364e-
35 245-288 BL00098E
22.12 1.000e-34 314-
352 BL00098F 10.18
4.971e-22 365-386
BL00098A 10.60 6.455e-
11 38-50


535


PR00370


FLAVIN-CONTAINING
MONOOXYGENASE (FMO)
SIGNATURE


PR00370E 11.96 7.429e-
22 321-340 PR00370D
16.33 6.143e-21 185-
204 PR00370F 17.75
6.559e-21 376-396
PR00370B 10.91 9.591e-
21 27-46 PR00370C
12.72 3.500e-20 140-
157 PR00370A 3.35
6.442e-17 4-20


536


BL00028


Zinc finger, C2H2 type,
domain proteins.


BL00028 16.07 7.429e-
16 285-302 BL00028
16.07 6.294e-14 341-
358 BL00028 16.07
1.346e-11 369-386
BL00028 16.07 1.692e-
11 397-414 BL00028
16.07 4.452e-11 453-
470 BL00028 16.07
7.231e-11 425-442
BL00028 16.07 4.300e-
10 313-330


537


BL00762


WHEP-TRS domain
proteins.


BL00762A 23.43 9.419e-
15 844-881


538


BL00762


WHEP-TRS domain
proteins.


BL00762A 23.43 9.419e-
15 819-856


539


BL00762


WHEP-TRS domain
proteins.


BL00762A 23.43 9.419e-
15 822-859


540


PR00985


LEUCYL-TRNA SYNTHETASE
SIGNATURE


PR00985A 12.10 9.000e-
10 357-375






PD02102


SUBUNIT E V-ATPASE
VACUOLAR ATP SYNTHASE
UVr\TJ AT


PD02102A 16.74 1.000e-
40 3-47 PD02102B
18.28 4.375e-34 57-100
PD02102D 21.69 1.923e-
30 179-218 PD02102C
26.34 8.929e-26 100-
146


543


BL00028


Zinc finger, C2H2 type,
domain proteins.


BL00028 16.07 1.000e-
10 48-65 BL00028
16.07 6.400e-10 193-
210 BL00028 16.07
1.000e-09 343-360
BL00028 16.07 6.914e-
09 78-95


545


BL00250


TGF-beta family
proteins.


BL00250A 21.24 8.000e-
31 293-329 BL00250B
27.37 5.286e-24 354-
390


547


PR00319


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319B 11.47 2.714e-
09 106-201 PR00319A
15.27 7.344e-09 210-
227


548


BL01204 


NF-kappa-B/Rel/dorsal
domain proteins.


Ri.fiionia 1*7 ~ia 1 nnria 
mj\Ji£\j'*j\ X/ . i «* l.uuue- 

40 8-56 BL01204D 

16 42 1 1*7*7- 

221 BL01204E 13.83 
7.652e-30 225-250 
BL01204C 13.93 8.714e-
22 141-160 BL01204B 
15.41 4.333e-16 102- 
116 


549


PR00326 


GTP1/OBG GTP-BINDING 
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 8.364e- 
15 255-276 


551 


PF00632 


HECT-domain (ubiquitin-
transferase).


PF00632C 20.66 3.302e- 
23 1569-1601 PF00632B 
18.45 3.700e-21 1515- 
1543 


554


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins.


BL00290B 13.17 1.000e-
14 187-205 BL00290A 

153 


557 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 6.339e- 
09 846-879


559 


DM01111 


4 kw PHOSPHATASE 
TRANSFORMING 61K PDF1.


DM01111L 11.93 3.762e- 
09 7-35 


562 


PF00658 


Poly-adenylate binding 
protein, unique domain 
proteins.


PF00658C 16.33 9.455e- 
32 118-155


564


BL00141 


Eukaryotic and viral 
aspartyl proteases 
proteins. 


BL00141A 12.10 4.150e- 
10 472-488 


566


PF00855 


PWWP domain proteins.


15 272-289


567


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 4.977e-
13 229-268 


569 


BL00107 


Protein kinases ATP-
binding region proteins.


BL00107A 18.39 7.000e- 
19 118-149 BL00107B 
13.31 5.500e-15 183- 
199 


570 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 7.000e-
19 118-149 BL00107B
13.31 5.500e-15 183-
199 


572 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


C l\.\J U X ZJ J U It . JO 1 1 <13 /c 

34 454-483 PR00193C 
12.60 2.636e-31 223-
251 PR00193B 11.69 
7.750e-29 171-197 
PR00193A 15.41 2.588e- 
22 115-135 PR00193E 
19.47 6.559e-19 508- 
537


573


PR00193 


MYOSIN HEAVY CHAIN
SIGNATURE


PkooTTTri T3" ' -VK i Rtt7ft, — 

ft^VVJ J>ii JD Jl .03/8- 

34 470-499 PR00193C 
12 60 2 f"."iGf»-"*1 919- 
267 PR00193B 11.69 
7.750e-29 171-197 
PR00193A 15.41 2.588e-
22 115-135 PR00193E 
19.47 6.559e-19 524- 
553 


575 


BL00752 


XPA protein. 


BL00752B 19.17 9.703e- 
10 885-929 


576 



BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 7.000e- 
09 276-295


577


BL00116


DNA polymerase family B
proteins.


BL00116A 12.81 5.737e-
13 864-877 BL00116B
11.82 1.529e-12 952-
965


WO 01/53312 PCT/US00/34263


578 


BL00195 


Glutaredoxin proteins.


09 121-141 


579 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 9.000e-
11 217-231 PR00019B 
11.36 1.350e-09 386- 
400 PR00019A 11.19 
3.333e-09 389-403 
DPrtnm on ii it q aorta 

09 363-377 


580 


PR00253 


GAMMA-AMINOBUTYRIC ACID
(GABA) RECEPTOR 
SIGNATURE 



PR00253A 9.15 2.125e- 
25 275-296 PR00253B 
13.47 7.923e-24 301- 
323 PR00253D 16.68 
5.846e-23 444-465 
PR00253C 13.85 2.241e- 
20 335-357 


583 


PR00343 


COMPLEMENT-BINDING 
REPEAT SIGNATURE 


PR00343C 16.85 2.286e- 
11 1233-1252 PR00343C 
16.85 5.500e-11 333-
352 PR00343C 16.85
5.500e-11 783-802
PR00343C 16.85 4.246e-
10 1491-1510 PR00343C 
16.85 8.230e-10 1686- 
1705 


584 


DM01537 


"kw SKI2W SKI2 NUCLEOLAR 

HTTI.THUQU! 
nc jj j, ^rtOJB . 


DM01537B 21.63 1.878e- 
37 79-126 DM01537B 
21.53 9.491e-30 916- 
963 DM01537A 15.14 
3.196e-11 784-804


586 


PF00013


KH domain nirohf*"! n« 
family of RNA binding 
proteins.


rrUUUlo o. 78 1.45UC-09 
124-136 


587 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 4.409e-


589 


BL00478 


LIM domain proteins. 


BL00478B 14.79 1.643e- 

J.J 6D1-4 (D 0llUU4 f OO 

14.79 7.709e-09 321- 
336 


590 


PF00855 


PWWP domain proteins. 


PF00855 13.75 8.000e- 
15 931-948


591


PF00855 


PWWP domain proteins. 


PF00855 13.75 5.000e-
15 1062-1079 


593 


PF00628 


PHD-finger.


Dcnnno oa o a c c 
crVVbZo lb.o4 3.455e- \ 

12 424-439 


594 


PR00205 


CADHERIN SIGNATURE 


PR00205B 11.39 2.241e-

16 558-576 PR00205A 

558 PR00205C 13.65 
5.304e-12 594-609 
PR00205B 11.39 4.273e- 
10 336-354 


596 


BL00107


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 4.789e- 
18 307-338 


598 


PD01675 


GLYCOPROTEIN MAJOR 
ENVELOPE PROBABLE U3.


PD01675C 19.89 2.330e- 
10 55-39 


600


BL00242 


Integrins alpha chain 
proteins.


BL00242E 9.03 9.$91e-
27 985-1014 BL00242C 
16.86 4.115e-25 286- 
316 BL00242D 13.57 
4.150e-25 357-382
BL00242B 8.13 7.353e- 
12 189-199 BL00242D 
13.57 3.455e-11 421-
446 BL00242A 13.80
S.OOOe-ll ^1-73 
BLO0242D 13 57 4 qqcp 

10 291-316 


601 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


09 198-213 


602 


PR00278 


PANCREATIC HORMONE 
SIGNATURE 


PR00278A 12.43 4.E>fc>9e- 
10 331-348 


603 


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins.


BL00479C 12.01 3.250e-
12 170-183 


604


BL00315 


Dehydrins proteins. 


BL00315A 9.35 1.672e-
09 424-452 


605 


BL00415 


Synapsins proteins. 


BL00415N 4.29 9.794e-
10 295-339


606 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926F 17.75 1.000e-
13 335-358 


608 


PF00855 


PWWP domain proteins. 


PF00855 13.75 5.167e-
15 265-282 


609 


PF00855


PWWP domain proteins. 


PF00855 13.75 5.167e- 
15 211-228 


612 


DM01206 


CORONAVIRUS NUCLEOCAPSID
PROTEIN. 


DM01206B 10.69 7.411e- 
10 877-897 DM01206B 
10.69 8.027e-10 861- 
881 DM01206B 10.69 
9.137e-10 873-893 
DM01206B 10.69 1.456e-
09 859-879 DM01206B
10.69 1.797e-09 879-
899 DM01206B 10.69
4.076e-09 865-885
DM01206B 10.69 7.038e-
09 898-918 DM01206B
10.69 7.949e-09 871-
891 DM01206B 10.69
8.291e-09 767-787


615


PD02699


PROTEIN DNA-BINDING
BINDING DNA. 


PD02699A 8.91 2.023e- 
28 129-158 PD02699C 
24.84 1.000e-27 317- 
364 PD02699B 18.28 
1.000e-17 158-182 


616 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 4.086e- 
22 288-310 PR00380D 
9.93 3.721e-17 486-508 
PR00380B 12.64 2.241e- 
16 410-423 PR00380C 
13.18 2.976e-13 436-
455


617 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 4.086e- 
22 288-310 PR00380D
9.93 3.721e-17 486-508
PR00380B 12.64 2.241e- 

13.18 2.976e-13 436- 
455 


618


DM01206 


CORONAVIRUS NUCLEOCAPSID 
PROTEIN.


DM01206B 10.69 5.143e-
TO C7i_cci nMmonco 

DJl-DDJ. Um\JX*ivOtS 

10.69 2.603e-10 535- 


621 


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700B 16.80 3.160e-
21 561-582 


622 


BL00239 


Receptor tyrosine kinase 
class II proteins. 


BL00239F 28.15 3.222e-
10 647-692 BL00239C 
18.75 8.304e-10 543- 
566 


623 


PR00407


EUKARYOTIC MOLYBDOPTERIN
DOMAIN SIGNATURE 


PR00407K 9.94 8.448e- 
09 326-339 


624


BL00641


Respiratory-chain NADH
dehydrogenase 75 Kd
subunit proteins.


BL00641C 21.10 1.000e-
40 157-202 BL00641E
24.17 1.000e-40 255-
308 BL00641F 33.12
1.000e-40 571-623
BL00641A 17.15 1.816e-
37 48-60 BL00641B
12.62 5.846e-34 113-
139 BL00641D 13.23
9.308e-29 216-240




PR00103 


CAMP-DEPENDENT PROTEIN
KINASE SIGNATURE 


PR00103E 17.80 2.500e- 
18 367-380 PR00103B 
13.39 2.080e-14 297- 

J 6 trKUUlUJrt i) ,33 

2.957e-14 282-297 
PR00103D 10.83 3.077e- 

15 Td(v-TC.ft DDnnimn 
±4 J1D"J3d rnUUlU jl, 

15.68 1.000e-ll 334- 

1.450e-ll 175-190 
PR00103A 9.59 1.720e- 
10 160-175 


630 


PR00081 


GLUCOSE/RIBITOL 
DEHYDROGENASE FAMILY 
SIGNATURE 


PRbOOBlA 10.53 b\£lle- 
16 4-22 


631


PF00651 


BTB (also known as BR-
C/Ttk) domain proteins.


PF00651 15.00 8.500e- 
14 37-50 


632 


DM01206 


CORONAVIRUS NUCLEOCAPSID 


DM01206B 10.69 2.233e-
10 1324-1344 DM01206B 
10.69 4.822e-10 1276- 
1296 DM01206B 10.69 
7.658e-10 1328-1348 
DM01206B 10.69 8.274e-
10 1280-1300 DM01206B 
10.69 4.532e-09 1320- 
1340 DM01206B 10.69 
/.2bbe-09 1326-1346 


635


BL00107


Protein kinases ATP-
binding region proteins.


BL00107A 18.39 7.600e- 

143-176 BL00107B 
13.31 2.636e-13 211-
227 


636


BL00657 


Fork head domain 
proteins. 


BL00657A 19.39 1.545e- 
30 101-143 BL00657B 
22.27 7.750e-26 149- 
192 


637 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 1.000e-
10 607-623 


643 


BL00018 


EF-hand calcium-binding 
domain proteins. 


BL00018 7.41 4.913e-09 
199-212 


647 


PF00628


PHD-finger.


13 385-400 PF00628 
479 


648 


BL01129 


Hypothetical 
yabO/yceC/sfhB family 
proteins. 


BL01129E 13.25 4.000e-
25 332-357 BL01129C 
25.56 8.200e-23 236- 
279 BL01129B 12.51
6.118e-13 191-212 


649 


BL01228


Hypothetical cof family 
proteins. 


BL01228D 17.44 3.908e-
10 455-480 


650 


BL00027


'Homeobox' domain
proteins. 


BL00027 26.43 6.684e- 
13 771-814 


651 


BL50002 


Src homology 3 (SH3) 
domain proteins profile. 


BL50002A 14.19 1.750e- 
12 1026-1045 


653 


PR00253 


GAMMA-AMINOBUTYRIC ACID
(GABA) RECEPTOR
SIGNATURE 


PR00253A 9.15 4.000e-
24 253-274 PR00253C 
13.85 8.800e-24 313- 
335 PR00253B 13.47 
3.143e-22 279-301 
PR00253D 16.68 7.652e-
20 422-443 


654 


PD01719 


PRECURSOR GLYCOPROTEIN 
SIGNAL RE. 


PD01719A 12.89 4.452e- 
11 969-997 PD01719A 
12.89 3.961e-10 128-
156 PD01719A 12.89 
7.395e-10 1276-1304 
PD01719A 12.89 1.222e- 
09 1220-1248 


657 


BL00354 


HMG-I and HMG-Y DNA- 
binding domain proteins 
(Ahook).


BL00354C 6.61 8.397e- 
09 563-578 


658 


BL00354 


HMG-I and HMG-Y DNA- 
binding domain proteins 
(Ahook).


BL00354C 6.61 8.397e-
09 580-595


659 


DM00215 


PROLINE-RICH PROTEIN 3.


DM00215 19.43 2.174e- 
13 539-572 DM00215 
19.43 4.750e-12 549-
582 DM00215 19.43
9.824e-11 551-584
DM00215 19.43 2.929e-
10 548-581 DM00215
19.43 4.054e-10 550-
583 DM00215 19.43
5.339e-10 552-585
DM00215 19.43 7.107e-
10 544-577


660 


PR00688


XYLOSE ISOMERASE
SIGNATURE 


PR00688I 13.78 9.518e-
09 224-236 


661 


BL00027 


'Homeobox' domain
proteins. 


BL00027 26.43 5.950e- 
23 249-292 


662


PR00360 


C2 DOMAIN SIGNATURE 


PR00360B 13.61 7.158e- 
10 596-610 


663 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360B 13.61 7.158e-
10 596-610 


664 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360B 13.61 7.158e- 
10 596-610 


666


PR00819 


CBXX/CPQX SUPERFAMILY 
SIGNATURE 


PR00819B 10.83 8.900e-
10 704-720 


667


BL50040 


Elongation factor 1 
gamma chain profile. 


BL50040C 22.62 2.143e-
16 135-178 


668


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019B 11.36 1.360e- 
09 139-153 PR00019A 
11.19 1.667e-09 94-108 
PR00019B 11.36 4.600e- 
09 163-177 


670 


BL00018 


EF-hand calcium-binding
domain proteins. 


BL00018 7.41 3.250e-10 
681-694 BL00018 7.41 
6.400e-10 717-730 


672 


PD00131 


ATP-BINDING TRANSPORT
TRANSMEMBR. 


PD00131B 34.97 1.000e-
34 356-410 PD00131C 
19.59 1.346e-26 504- 
542 


673 


PR00667 


RETINAL PIGMENT 
EPITHELIUM-RETINAL GPCR 
SIGNATURE 


PR00667G 15.33 7.557e-
10 106-123 


674 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320A 16.74 4.857e- 
13 593-608 PR00320B 
12.19 4.115e-12 635- 
650 PR00320C 13.01 
8.435e-11 717-732
PR00320C 13.01 2.800e- 
10 635-650 PR00320C 
13.01 6.400e-10 593- 
608 PR00320B 12.19 
3.250e-09 593-608


675 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320A 16.74 4.857e- 
13 572-587 PR00320B 
12.19 4.115e-12 614-
629 PR00320C 13.01 
8.435e-11 696-711
PR00320C 13.01 2.800e- 
10 614-629 PR00320C 
13.01 6.400e-10 572-
587 PR00320B 12.19 


676 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 9.667e- 


679 


PF00642 


Zinc finger C-x8-C-x5-C-
x3-H type (and similar)


PF00642 11.59 3.700e- 

AO i*i?00642 

11.59 7.900e-12 187- 
198 


680 


PR00308 


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308C 3.83 8.754e- 
10 286-296 


681 


BL00019 


Actinin-type actin- 
binding domain proteins. 


BL00019D 15.33 4.200e- 
19 227-257


682


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700D 12.47 4.000e-
09 99-118 


687


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 8.500e-
10 538-553


689 


BL01024 


Protein phosphatase 2A 
regulatory subunit PR55 
proteins.


BL01024A 10.26 1.000e-
40 22-69 BL01024B
8.91 1.000e-40 86-127
BL01024C 7.80 1.000e-
40 146-185 BL01024D
13.22 1.000e-40 185-
222 BL01024E 11.96
1.000e-40 222-266
BL01024F 9.42 1.000e-
40 266-317 BL01024G
11.09 1.000e-40 317-
349 BL01024H 13.88
1.000e-40 389-442


691 


BL00027


'Homeobox' domain 
proteins. 


BL00027 26.43 8.071e-
31 152-195 


692 


BL00211 


proteins. 


BL00211A 12.23 5.050e-
09 45-57 


693 


BL00211 


ABC transporters family 


BL00211A 12.23 5.050e- 
09 45-57 


694 


BL00211 


ABC transporters family 
proteins . 


BL00211A 12.23 5.050e- 
09 58-70 


696 


BL00680


Methionine
aminopeptidase subfamily
1 proteins.


BL00680 14.37 5.304e-
17 173-195 


697 


BL00741


Guanine-nucleotide
dissociation stimulators
CDC24 family sign.


BL00741B 14.27 3.418e- 
11 242-265 


698


DM01930 


2 kw FINGER SMCX SMCY 
YDR096W. 


DM01930E 15.41 1.367e- 
37 170-215 DM01930P 
14.16 8.232e-28 267-
303 DM01930B 19.86 
9.163e-10 37-71 


700


PR00869 


DNA-POLYMERASE FAMILY X
SIGNATURE


PR00869A 12.80 1.281e-
16 245-263 


701


PR00048


C2H2-TYPE ZINC FINGER
SIGNATURE


PR00048A 10.52 2.174e-
10 77-91 PR00048A
10.52 6.870e-10 133-
147 PR00048A 10.52
8.826e-10 105-119
PR00048A 10.52 5.320e-
09 161-175


702


BL00523


Sulfatases proteins.


BL00523E 19.27 2.565e-
25 326-356 BL00523A
13.36 5.050e-16 38-55
BL00523B 8.64 5.909e-
15 86-98 BL00523C
12.64 5.500e-13 137-
148 BL00523D 9.89 
1.844e-ll 290-302 
BL00523G 9.46 5.500e- 
10 513-523 BL00523F
b.Jble-09 413- 

424 


703 


PR00048


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 8.412e-
12 376-390 PR00048B 
6.02 1.000e-10 334-344 
PR00048B 6.02 1.474e- 
09 364-374 


707 


PD00787 


SYNTHASE BIOSYNTHESIS 
TRANSFERASE. 


PD00787A 14.84 8.941e-
14 66-82 


708 


PR00761


SIGNATURE 


PR00761E 14.32 8.500e-
10 822-841 


712 


DM01354 


* kw TRANSCRIPTASE REVERSE
II ORF2.


DM01354Y 10.69 4.977e-
38 425-465 DM01354X 
13.86 7.300e-34 376- 
415 DM01354V 12.97 
4.923e-17 311-358 
DM01354W 12.64 5.596e- 
10 356-376 


713 


BL00039 


DEAD-box subfamily ATP- 
dependent helicases
proteins. 


BL00039D 21.67 7.545e- 
27 450-496 BL00039A 
18.44 2.537e-18 147- 
186 BL00039C 15.63 
2.216e-14 280-304
BL00039B 19.19 1.947e-
13 194-220 


715 


BL00383 


Tyrosine specific 
protein phosphatases 
proteins. 


BL00383E 10.35 4.981e- 
10 150-161 


717 


PF00777 


Sialyl transferase 
family. 


PF00777C 18.60 4.035e- 
21 106-161 


718 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031A 16.80 3.750e- 
39 20-68 DM00031B 
15.41 2.688e-28 04-118
DM00031C 12.79 1.300e- 
12 131-142 


719 


BL00243 


Integrins beta chain 
cysteine-rich domain 
proteins. 


BL00243B 17.54 1.000e-
40 131-172 BL00243C
16.42 1.000e-40 172-
208 BL00243D 24.07
1.000e-40 222-274
BL00243F 22.63 1.000e-
40 314-358 BL00243I
31.77 6.571e-39 607-
650 BL00243E 16.70
3.077e-35 274-304
BL00243G 21.38 3.625e-
34 358-400 BL00243H
17.53 5.235e-29 567-
593 BL00243A 17.61
3.250e-21 63-84
BL00243H 17.53 7.167e-
16 477-503 BL00243H
17.53 2.304e-11 524-
550 BL00243H 17.53
5.304e-11 606-632
BL00243I 31.77 1.380e- 
09 610-653 


720 


PR00217 


43 KD POSTSYNAPTIC 
PROTEIN SIGNATURE 


PR00217C 10.91 8.022e-
09 20-36 


722 


PR00704 


CALPAIN CYSTEINE 
PROTEASE (C2) FAMILY
SIGNATURE


PR00704D 11.05 5.909e- 
34 135-161 PR00704F 
13.61 7.000e-26 190- 
218 PR00704E 12.55 
8.071e-26 165-189
PR0O"7OdR 1*7 OA *> Oiio. 

23 75-98 PR00704A 
14.68 4.094e-19 30-54 
PR00704C 11.88 1.871e- 
18 99-116 


725 


PR00194


TROPOMYOSIN SIGNATURE 


PR00194A 7.86 7.652e-
09 169-187


726


PR00194 


TROPOMYOSIN SIGNATURE 


PR00194A 7.86 7.652e- 
09 169-187 


727 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 2. ^Se- 
ll 3 *?T7-*JQ*> DTJAAIOAR 

16.74 1.310e-11 277-
292 PR00320C 13.01

PR00320A 16.74 6.586e-
11 323-338 PR00320B
12.19 4.343e-10 323- 
338 PR00320B 12.19 
6.914e-10 277-292 


731 


PR00195


DYNAMIN SIGNATURE 


PR00195A 11.94 8.627e- 
16 288-307 PR00195E 
9.82 3.912e-11 457-474


733


PF00642 


x3-H type (and similar).


PF00642 11.59 9.082e-
10 787-798 


738 


BL00039


DEAD-box subfamily ATP-
dependent helicases


BL00039A 18.44 2.565e- 
28 26-65 BL00039D 
21.67 2.105e-20 338- 
384 BL00039C 15.63 
9.100e-13 160-184
BL00039B 19.19 9.617e- 
11 73-99 


739 


BL01289 


TSC-22 / dip / bun 
family proteins. 


BL01289A 12.18 8.909e- 
31 326-353 BL01289B 
10.46 3.671e-17 353-
383 


742 


BL01019 


ADP-ribosylation factors 
family proteins.


BL01019A 13.20 7.078e-
12 41-81


743


BL00965 


Phosphomannose isomerase 
type I proteins.


BL00965C 23.78 1.000e-

/in *jcr me DTnrtatcti 
<!ot)-JUa D.LU09obB 

17.77 1.600e-25 126- 

*3J DjjU USD XU . D / 

6.400e-19 94-113 


747 


BL00021 


Kringle domain proteins.


BL00021D 24.56 4.563e- 
25 231-273 BL00021B 
13.33 5 345e-2l fif1-7ft 


748 


BL00612


Osteonectin domain 
proteins.


BL00612B 11.35 2.034e- 
11 93-126 


749 


PR00450


RECOVERIN FAMILY 
SIGNATURE 


PR00450C 12.22 6.880e- 
10 135-157


752


BL00795 


Involucrin proteins. 


BL00795C 17.06 6.000e- 
11 384-429 BL00795C
17.06 9.444e-11 370-
415 


754 


BL00051 


Ribosomal protein L39e 
proteins. 


BL00051 20.92 1.935e- 
16 4-50 


755 


DM01970 


0 kw ZK632.12 YDR313C 
ENDOSOMAL III. 


DM01970B 8.60 7.723e- 
09 171-184 


760 


BL01020


SAR1 family proteins.


BL01020C 15.35 9.020e- 
12 99-150 


762 


BL00046


Histone H2A proteins.


BL00046 12.95 1.000e-
40 33-88 


763


PD02411 


PROTEIN TRANSCRIPTION 
REGULATION NUCLEAR. 


PD02411 21.89 9.137e- 
10 206-240 


764 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 8.800e- 
29 417-460 


767 


BL01208 


VWFC domain proteins. 


BL01208B 15.83 6.063e- 
10 309-324 BL01208B 
15.83 8.031e-10 165-
180 BL01208B 15.83
4.162e-09 85-100 



BL00031



772 



PR00449



773 



BL00523 



BL00028



776 



BL00028



777



778 



BL00028
BL00030



779



PR00079 



781 



Nuclear hormones 
receptors DNA-binding 
region proteins. 



BL00031A 19.55 9.571e-
32 208-241 BL00031B
22.25 5.500e-27 242- 
274 



TRANSFORMING PROTEIN P21
RAS SIGNATURE 



Sulfatases proteins.



Zinc finger, C2H2 type, 
domain proteins. 



PR00449A 13.20 1.450e- 
18 4-26 PR00449E 
13.50 3.520e-14 142- 
165 PR00449C 17.27 
3.032e-13 44-67 
PR00449D 10.79 8.579e- 
13 107-121 PR00449B 
14.34 3.455e-11 27-44



BL00523E 19.27 9.333e-
23 299-329 BL00523A
13.36 2.200e-13 47-64
BL00523B 8.64 2.607e-
13 91-103 BL00523D
9.89 7.923e-12 224-236
BL00523C 12.64 4.512e-
10 141-152 BL00523F 
10.85 5.821e-10 373- 
384 



Zinc finger, C2H2 type,
domain proteins.



Zinc finger, C2H2 type,
domain proteins.



BL00028 16.07 7.686e-
09 568-585 



BL00028 16.07 7.686e-
09 621-638 



Eukaryotic RNA-binding
region RNP-1 proteins. 



GLUCOSE-6-PHOSPHATE
DEHYDROGENASE SIGNATURE



BL00028 16.07 7.686e- 
09 595-612 



BL00030A 14.39 8.412e-
11 322-341 BL00030A 
14.39 7.000e-10 220- 
239 



PR00079B 12.98 2.929e-
26 193-222 PR00079E 
16.65 4.150e-23 348- 
375 PR00079C 8.68 
6.351e-16 246-264 
PR00079D 13.51 7.070e-
16 264-281 PR00079A 
16.12 6.769e-13 169- 
183 



~78T 



PD00209 



785 



BL00690 



PR00449 



788 
T9lT 



DM01206 



BL00915 



Mitochondrial energy 
transfer proteins. 



PROTEIN SH3 DOMAIN
REPEAT PRESYNA. 



DEAH-box subfamily ATP-
dependent helicases 
proteins. 



TRANSFORMING PROTEIN P21 
RAS SIGNATURE 



CORONAVIRUS NUCLEOCAPSID
PROTEIN. 



Phosphatidylinositol 3-
and 4-kinases proteins.



BL00215A 15.82 9.250e- 
17 10-35 BL00215A 
15.82 6.000e-16 221- 
246 BL00215A 15.82 
7.857e-12 108-133 
BL00215B 10.44 9.526e- 
11 168-181 



PD00289 9.97 6.276e-09 
159-173 



BL00690B 13.38 1.000e-
12 147-165 BL00690A 
6.87 5.320e-10 114-124 
BL00690C 7.51 3.189e-
09 218-228

PR00449C 17.27 8.500e-
16 50-73 PR00449A 
13.20 5.235e-14 8-30 
PR00449E 13.50 2.853e- 
11 150-173 PR00449D 
10.79 1.545e-09 111-
125 



DM01206B 10.69 8.767e-
10 1-21 



BL00915C 22.43 9.182e-
39 725-764 BL00915B
22.78 5.050e-33 633- 
671 BL00915D 27.02 
1.529e-21 795-831 
BL00915A 10.09 1.000e-
13 395-407 


791 


PR00208


GLIADIN AND LMW GLUTENIN
SUPERFAMILY SIGNATURE 


PR00208A 12.59 6.294e-
10 120-138 PR00208A
12.59 6.294e-10 121-
139 PR00208A 12.59
6.294e-10 122-140
PR00208A 12.59 6.294e-
10 123-141 PR00208A
12.59 6.294e-10 124-
142 PR00208A 12.59
6.294e-10 125-143
PR00208A 12.59 6.294e-
10 126-144 PR00208A
12.59 6.294e-10 127-
145 PR00208A 12.59
6.294e-10 128-146
PR00208A 12.59 6.294e-
10 129-147 PR00208A
12.59 7.411e-09 130-
148 PR00208A 12.59
7.658e-09 131-149
PR00208A 12.59 7.904e-
09 132-150 PR00208A 
12.59 8.274e-09 118- 
136 PR00208A 12.59 
8.274e-09 119-137 


795 


PR00205 


CADHERIN SIGNATURE 


PR00205B 11.39 5.034e- 
16 302-320 PR00205A 
14.73 1.257e-ll 284- 
300 PR00205C 13.65 
1.333e-ll 337-352 


796 


BL00412 


Neuromodulin (GAP-43)


BL00412D 16.54 4.000e-
12 196-247 BL00412D
16.54 5.705e-11 197-
248 BL00412D 16.54
7.848e-10 199-250
BL00412D 16.54 1.827e-
09 195-246 BL00412D 
16.54 1.918e-09 194- 
245 BL00412D 16.54 
2.102e-09 201-252 


797 


BL00021 




BL00021B 13.33 6.339e-
13 40-58 


799 



BL01052 


Calponin family repeat
proteins.


AbuJLUsic lo.bl l.OOUe- 
40 87-127 BL01052A 

BL01052B 15.31 1.257e- 
25 52-78 BLOlO^^n 
10.26 5.737e-25 174- 
194 


800 


BL00348


p53 tumor antigen 
proteins. 


BL00348F 23.19 3.714e-
09 197-240 


801 


BL00309


vertebrate galactoside- 
binding lectin proteins. 


BL00309C 18.65 1.621e- 
09 62-87 


802 


PR00245


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245D 10.47 5.224e- 
09 187-199 


804 


PF00774 


Dihydropyridine
sensitive L-type calcium 
channel (Beta subuni. 


PF00774A 16.47 8.457e- 
10 110-156 


808 


PR00667


RETINAL PIGMENT 
EPITHELIUM-RETINAL GPCR
SIGNATURE 


PR00667C 11.71 9.875e- 
09 12-28 


810 


PD02346 


PHOTOSYSTEM II PROTEIN
PRECURSOR
PHOTOSYNTHESIS.


PD02346F 12.89 4.340e-
09 317-354




811 


BL00685 


CBF-A/NF-YB subunit
proteins. 


BL00685B 14.41 6.779e-
14 54-95 BL00685A
11.22 4.798e-13 5-54


812 


PR00080 


ALCOHOL DEHYDROGENASE 
SUPERFAMILY SIGNATURE


PR00080A 9.32 9.419e- 
10 93-105 


"aii 


BL00357 


Histone H2B proteins. 


BL00357 7.74 1.908e-17 
22-65 


815 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 7.923e-
15 158-171 PD00066
13.92 5.200e-14 46-59
PD00066 13.92 7.000e-
14 18-31 PD00066
13.92 7.000e-13 130-
143 PD00066 13.92
7.500e-13 214-227
PD00066 13.92 9.000e-
13 102-115 PD00066
13.92 4.429e-12 186-
199 PD00066 13.92
1.783e-11 74-87


pi c " 

Bib 


BL01195 


Peptidyl-tRNA hydrolase
proteins. 


BL01195C 20.12 3.348e- 
20 100-139


820 


BL00520


Interleukin-10 family 
proteins. 


BL00520A 6.21 6.471e-
09 1-14 


822 


BL00972 


Ubiquitin carboxyl- 
terminal hydrolases
family 2 proteins.


BL00972A 11.93 8.113e- 
09 224-242 


825 


PR00876 


NEMATODE METALLOTHIONEIN 
SIGNATURE 


PR00876B 7.66 2.268e- 
10 101-115 


829 


PD02855 


FLAVOPROTEIN PROTEIN
DNA/PANTOTHEN. 


PD02855A 18.37 4.732e-
26 88-124 PD02855B 
8.36 6.476e-09 132-142 


830 


PR00405 


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405B 11.83 7.000e- 
21 44-62 PR00405C 
19.41 1.000e-13 65-87 
PR00405A 17.71 7.283e-
13 25-45 


831 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 1.000e-
09 47-61 PR00019B
11.36 1.720e-09 136-
150 PR00019B 11.36
3.880e-09 44-58


832 


PR00011 


TYPE III EGF-LIKE 
SIGNATURE 


PR00011B 13.08 3.438e-
16 164-183 PR00011D
14.03 6.850e-16 164-
183 PR00011A 14.06
8.364e-14 164-183
PR00011C 24.25 5.415e-
12 231-260 PR00011D
14.03 9.852e-11 212-
231


834 


PD00306 


PROTEIN GLYCOPROTEIN 


PD00306A 10.26 7.000e- 
12 232-246 


835 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 4.000e- 
10 290-304 


836 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 7.000e-
12 216-230 


837 


DM00215 


PROLINE-RICH PROTEIN 3.


DM00215 19.43 3.898e- 

U7 / O - 1 A. 1


839


PD02784 


PROTEIN NUCLEAR 
RIBONUCLEOPROTEIN.


PD02784B 26.46 8.302e- 
09 73-116 


840 


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700B 16.80 5.091e- 
22 369-390 PR00700D 
12.47 5.765e-21 491- 
510 PR00700C 13.17 
4.750e-14 449-467 
PR00700F 11.18 8.500e-
11 538-549 PR00700E
17.57 3.100e-10 522-
538


841 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN
SIGNATURE 


PR00109B 12.27 5.404e- 
13 134-153 


844 


PD02785 


PROTEIN RIBOSOMAT. flft<J 
L22 RNA-BINDING HEP . 


ryU«/03ti 11.4.5 l.OQGe- 

40 58-112 PD02785A 

1 "7\ t qiCo.Ta a ci 
J.D . £j 1 .jiDe-^o o-ij/ 


845


BL00826


MARCKS family proteins.


BL00826C 7.*3 6.738e- 


846 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 4.429e- 
10 15-24 


849 


BL00518


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 1.000e-
08 340-349 


850 


PR00308


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308A 5.90 6.506e- 
09 12-27 


851 


PD02411 


PROTEIN TRANSCRIPTION 
REGULATION NUCLEAR. 


PD02411 21.89 7.000e- 
16 246-280 


852 


BL00420 


Speract receptor repeat 
proteins domain 
proteins. 


BL00420B 22.67 1.000e-
40 723-778 BL00420B
22.67 1.321e-38 933-
988 BL00420B 22.67
8.457e-28 482-537
BL00420B 22.67 4.500e-
27 587-642 BL00420B
22.67 9.625e-27 270-
325 BL00420B 22.67
4.205e-26 163-218
BL00420B 22.67 5.731e-
23 55-110 BL00420B
22.67 6.464e-20 377-
432 BL00420B 22.67
2.800e-15 830-885
BL00420C 11.90 1.900e-
13 355-366 BL00420C
11.90 1.900e-12 808-
819 BL00420C 11.90
3.550e-12 248-259
BL00420C 11.90 2.831e-
11 141-152 BL00420C
11.90 5.119e-11 1018-
1029 BL00420C 11.90
7.955e-10 567-578


853 


BL00420 


Speract receptor repeat 
proteins domain 
proteins. 


BL00420B 22.67 1.000e-
40 756-811 BL00420B
22.67 1.321e-38 966-
1021 BL00420B 22.67
8.457e-28 482-537
BL00420B 22.67 4.500e-
27 620-675 BL00420B
22.67 9.625e-27 270-
325 BL00420B 22.67
4.205e-26 163-218
BL00420B 22.67 5.731e-
23 55-110 BL00420B
22.67 6.464e-20 377-
432 BL00420B 22.67
2.800e-15 863-918
BL00420C 11.90 1.900e-
13 355-366 BL00420C
11.90 1.900e-12 841-
852 BL00420C 11.90
3.550e-12 248-259
BL00420C 11.90 2.831e-
11 141-152 BL00420C
11.90 5.119e-11 1051-
1062 BL00420C 11.90
7.955e-10 567-578


857 


PR00388 


3',5'-CYCLIC NUCLEOTIDE
CLASS II
PHOSPHODIESTERASE
SIGNATURE


09 64-83 


859 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030A 14.39 2.929e- 
13 37-56 BL00030B 
7.03 1.900e-11 167-177
BL00030A 14.39 2.000e- 
10 128-147 


861 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 4.250e- 

17 DT? nnQOQP 
*- * <c j — *» j. trt\ VW.700L 

13.64 8.714e-16 107- 
123 PR00989P 12 0~K 
7.828e-15 198-212 
PR00988E 8.27 9.769e-
12 176-188 PR00988D
5.95 8.250e-11 163-174
PR00988B 11.60 4.512e-
10 60-72 


863 


BL00215


Mitochondrial energy 
transfer proteins. 


BL00215B 10.44 8.071e- 
12 41-54 


864


PR00775 


90 KD HEAT SHOCK PROTEIN 
SIGNATURE 


PR00775E 8.06 1.000e-
24 198-221 PR00775B
J. 52 l.B37e-7.3 107-130
PR00775D 8.91 4.484e-
X/ X 71-189 PR00775A
9.90 8.342e-17 86-107
PR00775C 10.68 9.379e-
X/ X3J-171 PR00775G
10.64 6.850e-15 267-
286 PR00775F 12.76

9. / D Jt! — X *t 7"ZD r 


866 


DM01688


2 POLY-IG RECEPTOR.


DM01688G 16.45 9.460e- 
q« B9-121 


867 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 5.596e- 
29 14-53 


868 


BL01287 


RNA 3'-terminal
phosphate cyclase 
proteins. 


BL01287A 17.95 2.688e- 
26 16-48 


869 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19 4^ fi 1 dfidv*- ~ 

* V W _J l ^ ■ *t J O » *x © *x w 

10 304-337 


872 


BL00046 


Histone H2A proteins. 


BL00046 12.95 1.000e-
40 30-85 


874 


BL00188 


Biotin-requiring enzymes
attachment site
proteins.


BL00188 30.29 9.03£e- 
32 665-711


876


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 7.686e- 
09 298-315 


877 


PD02102 


SUBUNIT E V-ATPASE 
VACUOLAR ATP SYNTHASE 
HYDROL. 


PD02102A 1 6 74 4 n^P-' 
10 97-141 


879 


BL01189 


Ribosomal protein S12e 
proteins. 


BL01189A 14.27 1.000e-
40 35-71 BL01189B
13.49 1.000e-40 71-125


882 


BL00284


Serpins proteins. 


BL00284C 28.56 6.400e- 
25 62-104 BL00284B 
17 99 6 lfl>#»-19 ic.cc 


889 


BL00216 


Sugar transport 
proteins. 


BL00216B 27.64 4.375e- 
21 35-85 


896 


PR00391 


PHOSPHATIDYLINOSITOL 
TRANSFER PROTEIN 
SIGNATURE 


PR00391E 12.50 7.785e- 
15 211-231 PR00391B 
8.39 1.000e-13 83-104
PR00391D 12.21 9.328e-
13 191-207 PR00391A
7.83 5.390e-11 16-36


897 


PR00327 


ICE NUCLEATION PROTEIN
SIGNATURE


PR00327C 6.37 5.247e-
09 313-328


898 


BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 7.800e-
26 386-432 BL00039A 
18.44 6.674e-16 113- 
152 BL00039B 19.19 
1.947e-13 153-179 
BL00039C 15.63 9.460e- 
11 236-260 


901 


PD00066 


PROTEIN ZINC-FINGER 
METAL-BINDI.


PD00066 13.92 8.200e- 
13.92 8.200e-16 282- 

^ ri/VvUDO i 7^ 

8.200e-15 310-323
PD000SS 11 P. ?nr\/> 
16 366-379 PD00066 
13.92 8.200e-16 394- 
407 PD00066 13.92 
8.200e-14 338-351


902 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 9.321e- 
11 6-50 


903 


PR00806 


VINCULIN SIGNATURE 


PR00806B 4.28 9.160e-
09 97-111 


904 


PR00381 


KINESIN LIGHT CHAIN 
SIGNATURE 


PR00381E 8.75 6.556e-
25 335-356 PR00381B
18.17 2.667e-24 204-
224 PR00381A 9.55
2.800e-24 107-125
PR00381C 12.48 4.522e-
24 226-245 PR00381D
13.94 1.084e-22 291-
309 PR00381F 9.13
3.288e-22 370-392
PR00381F 9.13 7.181e-
13 286-308 PR00381E
8.75 4.066e-11 251-272
PR00381E 8.75 7.033e-
11 293-314 PR00381E
8.75 8.364e-10 377-398
PR00381D 13.94 5.230e- 

12.48 7.120e-09 310- 
329 


906 


PR00345


STATHMIN FAMILY
SIGNATURE


tr l\\J \J J<± DL t . D *» 0(33/8" 

09 525-549 


907 


PR00345 


STATHMIN FAMILY 
SIGNATURE 


09 513-537 


908 


BL00678


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9 67 9 Tn~fli*»-."ii 
144-155 


910 


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU. 


PD01056 19.43 £.8Q0e- 
30 48-87 


912 


BL01104 


Ribosomal protein L13e 
proteins. 


BL01104C 15.14 6.000e- 
09 364-392 


922 


BL00678


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 3.842e-09
500-511 


923 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 2.500e- 
09 323-338 PR00320C 
13.01 5.500e-09 187- 
202 


924 


PD02181 


PROTOCHLOROPHYLLIDE
REDUCTASE PHOTOSYNT.


PD02181D 12.85 8.609e-
09 36-54


926


BL00019 


Actinin-type actin- 
binding domain proteins. 


BL00019C 14.66 7.453e- 
25 108-144 BL00019B 
13.34 6.510e-11 61-84
BL00019D 15.33 9.338e- 
11 205-235 BL00019A 
12.56 2.373e-10 34-45 


928


BL00678 


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 9.308e-11
273-284 BL00678 9.67
1.600e-10 314-325
BL00678 9.67 7.600e-10
360-371 BL00678 9.67


929 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 1.857e- 
10 137-146 


930


BL01085


Ribulose-phosphate 3-
epimerase family
proteins.


BL01085D 16.55 4.600e- 
24 134-165 BL01085B 
10.15 5.680e-22 30-52 
□iiuiuoDCi JLo . o / o.b/oe — 
20 172-202 BL01085C


931 


BL01085 


Ribulose-phosphate 3-
epimerase family
proteins.


BL01085D 16.55 4.600e-

Oa nrninncn 
j. J* loj aiiuiut) jo 

10.15 5.680e-22 30-52 

DUU iwODCi J. O . a / O.QfOC - 

20 190-220 BL01085C
21.81 2.038e-14 66-97 


933 


PD00301 


PROTEIN REPEAT MUSCLE
CALCIUM-BI. 


09 160-171 


936 


PF00168 


C2 domain proteins


12 336-362 


937


BL00415 


synapsins proteins. 


BL00415N 4.29 9.519e- 
10 5-49 


940 


PR00862 


PROLYL OLIGOPEPTIDASE
SERINE PROTEASE (S9A)
SIGNATURE 


PR00862D 16.17 4.086e- 

f\Q CI Qd 


945 


BL01230


RNA methyltransferase
trmA family proteins.


BL01230B 11.62 2.373e- 
09 407-420 


946


BL00479 


diacylglycerol binding 
domain proteins.


BL00479B 12.57 7.429e-
18 52-68 BL00479A 
lif.ob ^ . .sUUe-lJ 2o-49 


949 


BL00678


Trp-Asp (WD) repeat
proteins proteins. 


BL00678 9.67 1.474e-09 
100-111 


954


PD01311


PROTEIN OXIDOREDUCTASE 
NAD INTERGENIC RE. 


10 66-111 


955 


PF00651 


BTB (also known as BR-
C/Ttk) domain proteins. 


12 47-60 


956 


PF00651 


BTB (also known as BR-

C/Ttk) domain proteins. 


rruuoDi id.uu j.iibue- 
12 47-60 


957


BL00379 


CDP-alcohol
phosphatidyltransferases
proteins.


BL00379 24. „4 l.fJlOe- 
15 111-148 


959 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


10 31-75 


960 


BL01115


GTP-binding nuclear
protein ran proteins. 


14 110-154 


962 


BL00061 


Short-chain
dehydrogenases/reductases
family proteins.


BL00061B 25.79 6.586e- 
13 198-236 


963 


PR00502 


MUTT DOMAIN SIGNATURE 


PR00502A 15.06 8.200e-
11 210-225 


966 


PR00308


TYPE I ANTIFREEZE
PROTEIN SIGNATURE


PR00308A 5.90 7.035e- 
uy lib- /u


967


DM01206 


CORONAVIRUS NUCLEOCAPSID
PROTEIN. 


DM01206B 10.69 1.286e-
12 104-124 DM01206B
10.69 5.299e-11 23-43
DM01206B 10.69 8.274e-
10 73-93 DM01206B
10.69 3.962e-09 108-
128 DM01206B 10.69
5.671e-09 38-58


969 


PF01008 


initiation factor 2 
subunit.


PF01008B 25.59 4.724e- 
31 417-460 PF01008C 
12.25 5.333e-18 506-
526 PF01008A 20.14 
5.875e-15 369-390


970 


BL01277 


Ribonuclease PH 

n y-i-i hoi no 

proteins . 


BL01277C 10.18 7.648e- 
10 112-143 BL01277A 
17.39 9.806e-10 40-78 


975


BL01159 


WW/rsp5/WWP domain
proteins.


BL01159 13.85 3.605e- 
12 130-145 BL01159 
13.85 4.122e-10 171- 
186 


977 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors.


PF00791C 20.98 2.235e- 
09 55-94 


978 


BL01167 


Ribosomal protein L17
proteins.


BL01167B 20.6^ 8.258e- 
19 88-127 


979


BL00478 


LIM domain proteins. 


BL00478B 14.79 9.357e- 
13 33-48 BL00478B
14.79 7.250e-12 98-113 


980 


PR00312 


CALSEQUESTRIN SIGNATURE


PR00312E 8.32 3.423e- 
36 169-199 PR00312I 
15.78 5.286e-35 332- 
361 PR00312F 15.06 
5.865e-35 199-229 
PR00312H 13.31 8.313e- 
35 263-291 PR00312J 
13.73 5.688e-34 363- 
392 PR00312D 9.43 
2.636e-33 128-158 
PR00312C 15.14 8.839e- 
33 92-122 PR00312B 
15.08 8.941e-33 62-92 
PR00312G 11.11 6.657e- 
32 230-258 PR00312A 
11.70 6.914e-27 35-59 


981 


PF00992


Troponin.


PF00992A l6.£7 8.816e- 
09 414-449 


982 


PR00299


ALPHA CRYSTALLIN
SIGNATURE 


PR00299F 13.20 2.367e- 
09 127-149 


983 


BL01150 


Respiratory-chain NADH
dehydrogenase 20 Kd 
subunit proteins. 


BL01150B 17.16 1.000e-
40 156-202 BL01150A 
14.10 8.200e-39 100- 
138 


986 


BL00795 


Involucrin proteins. 


BL00795C 17.06 7.211e-
14 4-49 BL00795C
17.06 1.778e-11 1-46
BL00795C 17.06 3.407e-
10 14-59 BL00795C
17.06 7.502e-10 2-47
BL00795C 17.06 8.640e-
10 19-64 BL00795C
17.06 7.400e-09 11-56
BL00795C 17.06 7.800e-

no t ao 


987 


BL00939


Ribosomal protein L1e
proteins.


BL00939F 17.27 S.393e- 
no n i no/in 


988 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 6.538e- 


989 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 6.538e-
11 497-513 


994


BL00027


'Homeobox' domain
proteins. 


BL00027 26.43 2.500e- 
25 146-189 


997 


BL01304 


ubiH/COQ6 monooxygenase
family proteins. 


BL01304A 8.05 3.893e- 
11 65-79 


998 


DM01767 


5 TRANSMITTER DOMAIN. 


DM01767B 10.07 7.868e- 
09 22-39 


1000 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926C 16.07 1.750e-
24 73-94 PR00926D
10.53 3.250e-23 126-
145 PR00926F 17.75
6.211e-23 217-240
PR00926E 11.70 6.625e-
20 174-193 PR00926B 
16.07 2.125e-18 24-39 
PR00926A 10.41 1.000e-
15 11-25 PR00926F 
17.75 5.565e-09 120- 
143 


1005


BL00406 


Actins proteins. 


BL00406B 5.47 1.600e-
40 88-143 BL00406C
6.75 1.000e-40 147-202
BL00406D 12.58 3.700e-
40 270-325 BL00406E
8.44 7.375e-38 327-377
BL00406A 9.95 3.348e-
29 11-46


1006 


BL00406


Actins proteins.


BL00406B 5.47 1.000e-
40 88-143 BL00406C
6.75 1.000e-40 147-202
BL00406E 8.44 1.000e-
35 248-298 BL00406A
9.95 3.348e-29 11-46


1007


PR00304


TAILLESS COMPLEX
POLYPEPTIDE 1
(CHAPERONE) SIGNATURE


PR00304D 11.04 8.714e-
22 384-407 PR00304C
8 69 4 667^-20 00.11 a
PR00304B 11.60 7.577e-
19 68-87 PR00304A 

? . £,\J J . JO«C" XO ^B"bj 

PR00304E 7.79 6.870e- 
13 416-431 


1009 


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 2.929e- 
32 9-48 


1011 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 2.929e- 
32 68-107 


1012


BL00518


Zinc finger, C3HC4 type
(RING finger), proteins.


BL00518 12.23 6.143e- 
10 64-73 


1016 


PD01168 


SYNTHETASE LIGASE 
PROTEIN ALANYL. 


PD01168H 12.08 1.000e-
11 174-194 


1018 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION.


PD00930B 33.72 1.391e-
32 261-302 PD00930A 
25.62 9.550e-22 157- 
183 


1022 


BL00175 


Phosphoglycerate mutase
family phosphohistidine 
proteins. 


BL00175A 15.42 5.179e-
12 6-26 BL00175C
23.75 8.062e-10 79-111


1025 


PR00305 


14-3-3 PROTEIN ZETA 
SIGNATURE 


PR00305D 1^.34 1.439e-
10 158-185 


1026


BL00353


HMG1/2 proteins. 


BL00353B 11.47 2.436e- 
18 238-288 BL00353C 
14.83 8.844e-11 288-
335 


1028 


BL00183


Ubiquitin-conjugating
enzymes proteins.


BL00183 28.97 1.310e-
33 43-91 


1033 


PF00580 


UvrD/REP helicase. 


PF00580A 13.37 4.720e- 
09 111-133


1034


PR00413 


HALOACID
DEHALOGENASE/EPOXIDE
HYDROLASE FAMILY 
SIGNATURE 


PR00413E 15.78 3.429e- 
09 154-171 


1037 


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 9.657e- 
09 5-44 


1038


PD01796


PROTEIN TRANSMEMBRANE 
COBALT ZINC CADMIU. 


PD01796 l£.6i 4.259e- 
11 55-82 


1039


BL00299 


ubiquitin domain 
proteins. 


BL00299 28.84 9.036e- 
09 17-69 


1040 


PR00970 


ARGININE ADP-
RIBOSYLTRANSFERASE
SIGNATURE


PR00970A 17.73 ff. 143ft-
20 56-78 PR00970D
9.96 2.154e-18 154-171
PR00970F 12.30 1.000e-
16 224-241 PR00970G
9.97 9.229e-15 242-258
PR00970B 16.37 1.290e-
13 86-105 PR00970C
11.05 1.643e-11 115-
130 PR00970E 11.23
9.820e-11 202-218


1042


BL00678 


Trp-Asp (WD) repeat 
proteins proteins.


BL00678 9.67 2.200e-10 
243-254 


1043 


PR00048


C2H2-TYPE ZINC FINGER 


PR00048A 10.52 6.786e- 
13 114-128 PR00048A 
10.52 1.000e-09 172- 
186 


1045 


BL00615


C-type lectin domain 
proteins. 


BL00615A 16.68 1.720e- 
11 218-236 BL00615B 
12.25 1.857e-10 317- 
331 


1046


BL01092


Adenylate cyclases
class-I proteins.


BL01092N 13.54 6.924e- 
10 3-40 


1047




ATP-citrate lyase / 
succinyl-CoA ligases 
family proteins. 


BL01216D 21.75 4.316e- 
28 314-344 BL01216A 
13.91 1.000e-10 97-112 


1049 


DM00031


IMMUNOGLOBULIN V REGION. 


DM00031B 15.41 7.010e-
12 102-136 


1050 


BL01073 


Ribosomal protein L24e 
proteins. 


BL01073 24.30 1.000e-
40 12-62


1054


BL00571


Amidases proteins. 


BL00571 25.69 5.875e- 
31 160-212 


1055 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 5.235e-
11 98-117 BL00030B
7.03 4.316e-09 137-147




BL00223 


Annexins repeat proteins 
domain proteins.


BL00223C 24.79 8.754e- 
23 262-317 BL00223A 
15.59 9.478e-14 46-80 
BL00223A 15.59 5.557e- 
11 118-152 


1060 


BL00027 


'Homeobox' domain
proteins. 


BL00027 26.43 3.455e- 
35 158-201 


106 J 4 


BL00455 


Putative AMP-binding 
domain proteins.


BL00455 13. 3i (J.21le- 
13 280-296 


1065 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 2.000e- 
09 115-129 PR00019B 
11.36 3.880e-09 87-101 


1066 


PR00326


GTP1/OBG GTP-BINDING 
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 4.600e- 
16 151-172 PR00326C 
9.79 1.290e-14 200-216
PR00326B 16.74 8.548e- 
14 172-191 PR00326D 
19.09 1.257e-13 217- 
236 


1071 




PRECURSOR. 


PD02870B 18.83 8.518e- 
11 164-197 


1072 


PF00856 


SET domain proteins. 


PF00856A 26.14 5.976e-
09 350-387 


1075 


BL01009 


Extracellular proteins 
SCP/Tpx-1/Ag5/PR-1/Sc7
proteins. 


BL01009D 14.19 4.300e- 
20 127-148 BL01009A 
13.75 6.586e-13 57-75 
BL01009E 13.50 1.439e- 
11 159-175 


1077 


PR00724 


CARBOXYPEPTIDASE C
SERINE PROTEASE (S10) 
FAMILY SIGNATURE 


PR00724A 10.91 1.000e-
08 366-379


1078


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 1.000e-
12 170-195 BL00215A 
15.82 7.529e-10 79-104 


1079 


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 4.316e-09
298-309


1081


BL00326


Tropomyosins proteins. 


STjOQ3?fiA Id fll 7 iqpo. 
"uw u jtu n. x *» . \j x / . j7Cc 

10 23-57 


1094 


BL00460


Glutathione peroxidases
selenocysteine proteins.


BL00460A 28.67 3.204e-
18 57-92 BL00460B
9.73 6.400e-13 100-118
BL00460D 16.89 9.143e-
12 162-182 BL00460C
14.35 5.500e-09 133-
156 


1095 


PD02811 


PROTEIN PEPTIDE 
REDUCTASE MG448 PILB
FIMBRIA TRAN.


PD02811A 20.67 3.017e-
22 67-105 PD02811B
17.07 2.263e-21 118-
151 PD02811C 13.25
5.696e-13 154-167


1096 


PD02811 


PROTEIN PEPTIDE 
REDUCTASE MG448 PILB
FIMBRIA TRAN.


PD02811A 20.67 3.017e-
22 60-98 PD02811B
17.07 2.263e-21 111-
144 PD02811C 13.25
5.696e-13 147-160


1097 


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins. 


BL00479B 12.57 6.143e- 

\J U C. X. 0 


1105 


PF00881 


Nitroreductase family. 


PF00881A 27.15 9.229e-
13 111-147 


1109 


PR00449


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 3.077e- 
1U / PKUU449E 
13.50 1.857e-09 185- 

aUO c KU U«4 Zt LI xU. / 3 

8.364e-09 131-145 


1115 


PR00405


HIV REV INTERACTING 
PROTEIN SIGNATURE 


rKuunuau xx . oj o . /j /e— 
20 42-60 PR00405A

PR00405C 19.41 6.902e- 
10 63-85 


1116 


BL00355


HMG14 and HMG17
proteins.


BL00355 5.97 2.528e-25
20-51


1117 


BL00355 


HMG14 and HMG17 
proteins.


BL00355 5.97 2.528e-25
20-51 


1120 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 4.857e- 
10 290-306 


1123 


PR00412 


EPOXIDE HYDROLASE
SIGNATURE 


12 301-324 


1125 


PR00186


HEMERYTHRIN SIGNATURE 


PR00186A 13.62 2.800e- 
09 87-101 


1129 


BL00170


Cyclophilin-type
peptidyl-prolyl cis-
trans isomerase
signatur.


BL00170C 18.49 3.077e-
33 84-129 BL00170B
20.97 6.838e-25 37-77 
BL00170A 17.08 3.455e- 
15 10-37 


1131 


BL00636


Nt-dnaJ domain proteins. 


BL00636A 8.07 5.304e- 
15 29-46 BL00636B 
15.11 1.360e-14 59-80


1132 


BL00678


Trp-Asp (WD) repeat
proteins proteins. 


BLOODS $.67 £.2116-09 
29-40 


1133 


BL00678 


Trp-Asp (WD) repeat
proteins proteins. 


BL00678 9 67 71T#>-r>Q "" 
29-40


1136


BL00990 


Clathrin adaptor 
complexes medium chain 
proteins. 


BL00990C 18.78 4.17£e- 
38 235-269 BL00990A 
21.44 4.316e-36 94-132 
BL00990B 20.15 2.125e-
27 157-187 BL00990D 
16.13 5.320e-18 403- 
422 


1137 


PR00314 


CLATHRIN COAT ASSEMBLY
PROTEIN SIGNATURE


PR00314B 15.68 8.000e-
34 100-128 PR00314D 
9.66 3.531e-33 233-261 
PR00314C 16.05 8.909e- 






WO 01/53312 



PCT/US00/34263 



32 159-188 PR00314A
14.53 1.281e-22 13-34


1139 


BL01115 


GTP-binding nuclear
protein ran proteins. 


BL01115A 10.22 6.364e-
13 13-57 


1141 


BL00107 


Protein kinases ATP-
binding region proteins. 


BL00107A 18.39 4.000e- 
19 451-482 BL00107B 
13.31 3.077e-12 519- 
535 


1148 


PR00685 


TRANSCRIPTION INITIATION 
FACTOR IIB SIGNATURE


PR00685A 13.62 4.676e- 
09 21-42 


1155 


PD01652 


RECEPTOR CELL NK 
GLYCOPROTEIN IMMUNOGLOB.


PD01652B 8.50 9.396e- 
10 522-574 PD01652B 
8.50 9.463e-10 740-792 


1157 


PD02894 


HYDROLASE N4- PRECURSOR 
PROTEIN SIGNAL BE. 


PD02894A 21.96 7.873e-

13.93 1.188e-27 178- 
211 


1159 


BL00623 


GMC oxidoreductases 
proteins.


BL00623E 15.00 3.531e- 
20 391-414 BL00623C 
10.86 4.240e-20 155- 
176 


1161 


PD01937 


DNA PROTEIN POLYMERASE 
ENDONUCLEASE DNA-.


PD01937A 6.68 3.475e- 
09 330-341 


1162 


PD01937 


DNA PROTEIN POLYMERASE 
ENDONUCLEASE DNA-.


PD01937A 6.68 3.475e-
09 221-21? 


1163 


PR00624 


HISTONE H5 SIGNATURE 


PR00624D 11.94 7.455e- 
10 214-239 PR00624D 
11.94 1.961e-09 312- 
337 


1167 


BL00226 


intermediate filaments 
proteins.


BL00226B 23.84 7.384e-
09 302-350 


1177 


BL01032 


Protein phosphatase 2C 
proteins.


BL01032G 8.33 1.422e-
10 34-48 


1178 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320A 16.74 1.794e-
J.U ^Ub-^20 PR00320C 
13.01 7.840e-10 205- 

<£^U <rKUQ32 LJo 12.19 

8.457e-10 35-50 
PR00320A 16.74 7.146e- 

12.19 9.100e-09 79-94 


1180 


PR00454 


ETS DOMAIN SIGNATURE 


PR00454D 10.89 4.150e-
19 765-784


1181


BL00291


Prion protein. 


BL00291A 4.49 8.962e- 
11 152-1A7 


1184 


BL00720 


Guanine-nucleotide 
dissociation stimulators 
CDC25 family sign. 


BL00720B 16.57 4.103e- 
18 1089-nn 


1185 


BL00215


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 4.553e- 
13 204-229 BL00215A 
15.82 1.429e-12 11-36 
BL00215A 15.82 9.809e- 
11 104-129 


1187 


BL00983 


Ly-6 / u-PAR domain
proteins. 


BL00983C 12 69 7 7GTp- 
10 77-93 


1168 

1191


BL00878


Orn/DAP/Arg
decarboxylases family 2
pyridoxal-P attachment
si.


BL00878B 10.95 6.000e- 

16 189-204 BL00878C 

17 74 R aico.ic one 

245 BL00878F 19.67 
3.625e-13 379-402 
BL00878D 16.56 1.621e- 
09 270-289 


1193


PD02939


PROTEIN GLUTATHIONE 
SYNTHETASE SY. 


PD02939B loVio 2.723e- 
12 203-220 PD02939C 
20.01 1.000e-11 224-
252 




PR00345


STATHMIN FAMILY
SIGNATURE


PR00345B 7.12 2.800e-
28 72-101 PR00345E
8.54 7.652e-28 149-174
PR00345C 4.54 9.100e-
28 101-125 PR00345D
10.97 1.964e-24 125- 
149 PR00345A 13.46 

£4f>«»..l£ A"i-Ci 
9. O<i3S"X0 4J-bJ 


1194


PR00345


STATHMIN FAMILY
SIGNATURE 


PR00345B 7.12 2.800e-
28 108-137 PR00345E
8.54 7.652e-28 185-210
PR00345C 4.54 9.100e-
28 137-161 PR00345D
10.97 1.964e-24
185 PR00345A 13.46
5.645e-16 79-98


1195 


PF00995


Sec1 family.


13 224-264 


1196 


BL00932 


Bacterial-type phytoene
dehydrogenase proteins. 


OUUUJOfin lO .41 Q , /Joe— 

11 15-47 


1197 


BL01298


Dihydrodipicolinate 
reductase proteins. 


BL01298A 13.90 5.9S9c» 
09 51-73 


1203 


BL00061 


Short-chain
dehydrogenases/reductases
family proteins.


BL00061B 25.79 1.000e-
14 152-190 


1204


PR00118


oa Art- i_w-Vw l/il*i/4oil< Ci^Aoo A. 

SIGNATURE 


PR00118F 16.42 9.386e-
09 213-229 


1206 


BL01183 


ubiE/COQ5
methyltransferase family
proteins.


BL01183B 21.31 1.429e- 
37 184-229 BL01183D 
27.71 8.535e-27 264- 
307 BL01183A 13.25 
3.250e-23 51-73
BL01183C 10.77 5.295e- 
09 246-258 


1208 


BL00979 


G-protein coupled
receptors family 3
proteins.


BL00979L 20.63 2.485e- 
09 105-146 


1209 


PF00023


Ank repeat proteins.


PF00023A 16.03 4.857e-
11 49-65 PF00023B
14.20 1.818e-09 45-55 


1212 


PR00048


SIGNATURE 


PR00048A 10.52 7.750e-
14 227-241 PR00048A 
iv . 96 4 . jioe-ii 199** 
213 


1213 


PR00450


RECOVERIN FAMILY 
SIGNATURE 


10 20-42 PR00450C 
12.22 3.506e-09 56-78
PR00450D 16.58 6.769e-
09 44-64 


1216 


BL00412 


Neuromodulin (GAP-43)
proteins. 


BL00412D 16.54 5.598e-
10 179-230 


1219 


PR00456


RIBOSOMAL PROTEIN P2 
SIGNATURE 


PR00456E 3.06 5.348e-
11 249-264 


1222 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 7.231e-
15 295-308 PD00066 

"11 i "5 *i "i a _ i re Anc 
-Lj.J* /.zJie-l3 41)6- 

419 PD00066 13.92 

2 7P.£a-13 17fl-"tcn 
^ i«oqc 14 J / o — -J yj, 

PD00066 13.92 7.857e- 

12 434-447 PD00066 

13.92 3.348e-11 350-

363 


1223 


BL50058 


G-protein gamma subunit
profile.


BL50058 27.23 1.000e-
40 13-61


122* 


BL00412 


Neuromodulin (GAP-43) 
proteins . 


BL00412D 16.54 fl.439e- 
09 279-330 


1227 


BL00437 


Catalase proximal heme- 
ligand proteins. 


BL00437A 18.82 1.000e-
40 49-101 BL00437B
16.28 1.000e-40 114-
168 BL00437C 21.86
1.000e-40 190-239
BL00437D 25.72 1.000e-
40 248-301 BL00437E
23.95 1.000e-40 327-
379 


1230


BL01160


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 8.297e- 
10 6-60 


1231 


PR00735 


GLYCOSYL HYDROLASE 
FAMILY 8 SIGNATURE 


PR00735A 11.19 6.857e-
09 391-405 


1232 


PR00497 


NEUTROPHIL CYTOSOL 
FACTOR P40 SIGNATURE


PR00497A 6.92 5.553e- 
10 158-176 


1233


PR00497


NEUTROPHIL CYTOSOL
FACTOR P40 SIGNATURE 


PR00497A 6.92 5.553e- 
10 158-176 


1235 


BL00866 


Carbamoyl-phosphate
synthase subdomain
proteins.


BL00866B 36.29 2.776e- 
09 75-121 


1237


BL00027


'Homeobox' domain
proteins. 


BL00027 26.43 1.818e- 
21 36-79 


1243 


PR00403 


WW DOMAIN SIGNATURE 


PR00403B 12.19 1.184e-
11 10-25 


1246 


PD01168 


SYNTHETASE LIGASE 
PROTEIN ALANYL. 


10 31-46 PD01168L
9.47 4.490e-10 174-189
PD01168L 9.47 7.612e-
10 183-198 


1249 


BL00018 


EF-hand calcium-binding
domain proteins.


BL00018 7.41 2.800e-10 
183-196 


1254 


BL00183


Ubiquitin-conjugating
enzymes proteins.


BL00183 28 97 ~> £4 Da- 
36 96-144 


1255 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 5.670e- 
11 8-52 


1256 


BL00373 


Phosphoribosylglycinamide
formyltransferase
proteins.


BL00373C 10.35 3.348e- 
12 143-156 


1258 


PR00011 


TYPE III EGF-LIKE
SIGNATURE 


PR0Q011B 11 Ofl 1 91 7». 
10 174-193 


1259 


BL00518 


Zinc finger, C3HC4 type
(RING finger), proteins.


BL00518 12.23 8.286e-
10 31-40 


1261 


PR00070


DIHYDROFOLATE REDUCTASE
SIGNATURE


PR00070D 11.63 1.000e-

J. j iiz-iz / trrtUUU /UL- 

13.09 9.500e-15 51-63
PR00070A 12.92 5.500e- 
12 16-27 


1262 


BL00462 


Gamma-
glutamyltranspeptidase
proteins. 


BL00462A 20.89 6.438e- 
24 140-183 BL00462B 
17.88 5.500e-20 230- 
267 BL00462C 27.41 
2.023e-11 292-347


1263 


BL00038 


Myc-type, 'helix-loop-
helix' dimerization
domain proteins. 


BL00038B 16.97 9.455e- 
11 62-83 


1264 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 5.670e-
11 17-61 


1266 


PR00837 


ALLERGEN V5/TPX-1 FAMILY 
SIGNATURE 


PR00837C 17.21 2.714e-
15 165-182 PR00837A
14.77 4.512e-12 86-105 
PR00837D 11.12 7.577e- 
12 201-215 


1269


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449C 17 71 q inHP- 
22 40-63 PR00449E 
13.50 1.000e-16 137- 
160 PR00449D 10.79 
3.520e-11 102-116


1270 


BL00276 


channel forming colicins 
proteins. 


BL00276A 8.87 1.500e- 
09 17-29 


1275 


PD02327 


GLYCOPROTEIN ANTIGEN 
PRECURSOR IMMUNOGLO.


PD02327C 15.47 9.769e- 
09 220-243 


1276 


PR00412 


EPOXIDE HYDROLASE
SIGNATURE


PR00412B 12.59 7.894e-
12 119-135 PR00412C
11.30 1.857e-11 165-
179 PR00412A 13.23
3.400e-11 100-119


1277 


PF00756 


Putative esterase. 


PF00756C 14.12 5.538e-
10 127-157 


1279 


BL00134 


Serine proteases, 
trypsin family, 
histidine proteins. 


BL00134A 11.94 9.325e- 
13 128-145 


1280 


BL01220 


Phosphatidylethanolamine 
-binding protein family 
proteins. 


BL01220C 14.75 9.348e- 
15 248-276 


1285 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins.


BL00518 12.23 2.286e- 
10 33-42 


1287 


PF00791 


Domain present in ZO-1
and Unc5-like netrin
receptors. 


PF00791B 28.49 7.182e- 
11 288-343 


1292 


PR00802 


SERUM ALBUMIN FAMILY 
SIGNATURE 


PR00802B 16.51 1.610e-
10 81-105 


1297 


PR00716 


M-PHASE INDUCER
PHOSPHATASE SIGNATURE 


PR00716C 17.65 5.696e- 
09 23-44 


1298 


BL00478 


LIM domain proteins.


BL00478B 14.79 6.478e- 
14 268-283 


1301 


BL00127 


Pancreatic ribonuclease 
family proteins. 


BL00127C 31.49 3.571e-
28 82-126 BL00127B
26.57 8.800e-28 23-68 


1302


PR00637


TYPE 3 BOMBESIN RECEPTOR 
SIGNATURE 


PR00637E 11.27 4.250e- 
09 290-306 


1307 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 5.500e-
17 13-38 BL00215A 
15.82 1.000e-16 226- 
251 BL00215A 15.82 
2.658e-13 107-132 


1308 


PR00898 


VASOPRESSIN V2 RECEPTOR
SIGNATURE 


PR00898H 11.34 4.682e- 
09 552-572


1309 


PD00301


PROTEIN REPEAT MUSCLE
CALCIUM-BI.


PD00301B 5.49 2.731e- 
09 390-401 


1310 


BL00983 


Ly-6 / u-PAR domain 
proteins. 


BL00983C 12.69 9.654e- 
13 73-89 BL00983B 
8.19 3.132e-09 12-22 


1313 


BL00194 


Thioredoxin family 
proteins. 


BL00194 12.16 1.966e- 
11 15-28 


1314 


BL00594


Aromatic amino acids 
permeases proteins. 


BL00594A 16.75 8.969e- 
10 53-97 


1316 


BL00134 


Serine proteases, 
trypsin family, 
histidine proteins. 


BL00134A 11.96 9.325e- 
13 128-145 


1320 


BL00783 


Ribosomal protein L13
proteins. 


BL00783C 22.43 6.559e- 
24 07-117 BL00783A 
14.55 1.600e-19 8-33 
BL00783B 12.76 3.500e- 
12 74-86 


1327 


PF00514


Armadillo/beta-catenin-
like repeat proteins.


PF00514A 31.30 7.268e- 
11 82-120 


1329 


BL00030


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 6.294e- 
11 129-148 BL00030B 
7.03 4.789e-09 168-178 


1331 


PR00497 


NEUTROPHIL CYTOSOL
FACTOR P40 SIGNATURE 


PR00497A 6.92 7.239e-
09 25-43 


1332 


PR00161


NICKEL-DEPENDENT
HYDROGENASE/B-TYPE
CYTOCHROME SIGNATURE 


PR00161C 9.51 4.930e- 
09 317-337 


1333 


PD01066


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 6.769e-
33 10-49 


133d 
"133 7 " 4" 


PR00700 
PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 
PROTEIN TYROSINE 


PR00700D 12.47 2.200e-
09 262-281

PR00700D 12.47 2.200e-






PHOSPHATASE SIGNATURE 


09 211-230


1340 


PR00860 


VERTEBRATE 
METALLOTHIONEIN
SIGNATURE 


PR00860A 5.46 5.034e-
13 5-18 


1341 


BL00893 


mutT domain proteins. 


BL00893 18.99 6.750e- 
16 46-71 


1343 


BL01282 


BIR repeat proteins. 


BL01282B 30. 4$ 5.974e- 
21 383-422 


1344 


DM00099 


4 kw A55R REDUCTASE 
TERMINAL 

DIHYDROPTERIDINE. 


DM00099B 14.73 8.313e-
09 417-427 


1345 


BL00923 


Aspartate and glutamate 
racemases proteins. 


BL00923B 11.41 5.935e-
10 135-146 


1348 


PF00651


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 7.231e-
13 44-57 


1350 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 3.571e-
32 416-445 PR00193C
12.60 6.318e-31 179-
207 PR00193B 11.69 
3.571e-24 133-159 
PR00193E 19.47 9.069e- 
22 470-499 PR00193A 
15.41 1.783e-20 77-97 


1352 


PR00447 


NATURAL RESISTANCE-
ASSOCIATED MACROPHAGE
PROTEIN SIGNATURE 


PR00447E 9.73 1.554e- 
15 299-319 PR00447D 
13.54 3.408e-15 200- 
224 PR00447A 12.73 
6.357e-11 97-124
PR00447G 6.69 9.077e-
10 353-373 


1353 


BL00303 


S-100/ICaBP type calcium
binding protein. 


BL00303A 21.77 6.667e-
26 45-82 BL00303B 
26.15 1.000e-24 93-130 


1355 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins. 


BL00039D 21.67 5.950e-
29 375-421 BL00039A
18.44 7.136e-29 99-138
BL00039C 15.63 4.000e-
18 225-249 BL00039B
19.19 3.182e-14 141-
167 


1357 


PF00615


Regulator of G protein 
signalling domain 
proteins. 


PF00615B 16.25 2.216e- 
12 84-101 PF00615C
10.06 8.412e-12 162-
176 


1360 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 9.234e-
29 10-49 


1361


PR00925


NONHISTONE CHROMOSOMAL 
PROTEIN HMG17 FAMILY 
SIGNATURE 


PR00925A 5.47 5.091e- 
18 14-29 PR00925B 
3.73 6.143e-14 29-42
PR00925C 5.57 4.789e- 
12 53-64 PR00925D 
6.56 1.857e-10 76-87 


1362 


BL01272 


Glucokinase regulatory 
protein family proteins. 


BL01272B 19.61 6.870e- 
30 136-171 BL01272C 
11.68 3.314e-25 249-
274 BL01272A 6.49
1.231e-18 99-117


1363 

1364 


BL01272


Glucokinase regulatory 
protein family proteins. 


BL01272B 19.61 6.870e-
30 113-148 BL01272C
11.68 3.314e-25 226-
251 BL01272A 6.49
1.231e-18 76-94




DM00179


kw KINASE ALPHA ADHESION
T-CELL. 


DM00179 13.97 5.304e-
09 167-177 


1368 

1370


PR00169 
PR00988 


POTASSIUM CHANNEL 
SIGNATURE 

URIDINE KINASE SIGNATURE


PR00169A 16.77 1.592e-
09 76-96 

PR009B8A £.3 9 1.7'$4e- | 








10 1-19 


1371


BL00242


Integrins alpha chain
proteins.


BL00242B 8.13 8.615e-
09 469-479 


1372 


PR00625


DNAJ PROTEIN FAMILY 
SIGNATURE 


PR00625B 13.48 7.353e-
19 46-67 PR00625A 
12.84 1.391e-16 14-34 


1373 


BL00434 


HSF-type DNA-binding 
domain proteins.


BL00434C 23.85 3.778e- 
09 90-130 


1374 


PR00962 


LETHAL (2) GIANT LARVAE 
PROTEIN SIGNATURE 


PR00962C 8.00 6.337e-
09 505-526 


1375 


PD02475 


MUCIN EPITHELIAL TUMOR- 
ASSOCIATE. 


PD02475A 23.18 8.552e-
10 1111-1150 


1376 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 9.571e- 
32 24-63 


1380


BL00194 


Thioredoxin family 
proteins. 


BL00194 12.16 8.333e-
12 48-61 


1381 


DM01970 


0 kw ZK632.12 YDR313C
ENDOSOMAL III. 


DM01970B 8.60 1.458e- 
15 1123-1136 


1383


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 7.600e-10 
243-254 


1384 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 7.500e-10
271-282 


1385 


BL00303


S-100/ICaBP type calcium 
binding protein. 


10 95-132 


1386


BL01160 


Kinesin light chain
repeat proteins.


BL01160B 19.54 5.042e- 


1387 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins.


BL00518 12.23 1.000e-
11 52-61 


1389 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 3.600e-
30 10-49 


1390 


PD01066


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 3.512e-
31 32-71 


1392 


PR00308 


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


rKyyjUHL 3.83 9.723C- 
10 127-137 


1393 


PR00380


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 9.625e- 
25 88-110 PR00380D
9.93 2.406e-20 304-326
PR00380B 12.64 4.414e-
16 208-226 PR00380C
13.18 6.538e-16 243-
262


1394 


PD00066 


rftw * a *» iwv." v AHvfiK 

METAL- BIND I . 


PD00066 13.92 3.400e-
14 462-475 PD00066 
13.92 8.800e-14 348-
Joi JrJJUUUbo XJ.?£ 
9.571e-12 405-418
PD00066 13.92 6.087e-
11 490-503 PD00066
13.92 8.043e-11 370-
333 


1398


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 6.786e-
32 10-49 


U6o 


DM01204 


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 7.038e- 
09 270-290 


1406 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930A 25.62 7.324e- 
15 363-389 


1407 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 7.500e- 
10 457-476 


1408


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 9.$S0e- 
11 179-193 PR00019A 
11.19 8.826e-10 228-
242 PR00019B 11.36 
1.360e-09 199-213 
PR00019B 11.36 4.960e- 


1409 


PR00510


NEBULIN SIGNATURE 


09 176-190 

PR00510A 9.09 4.150e- 
12 182-202 PR00510B 
12.96 8.767e-12 210- 
230 PR00510F 9.88 
8.172e-10 58-75
PR00510D 9.21 2.367e- 
09 251-267 


1410 


PD00078 


REPEAT PROTEIN ANK 
NUCLEAR ANKYR. 


PD00078B 13.14 5.696e- 
09 31-44 


1412 


BL00358 


Ribosomal protein L5 
proteins. 


BL00358B 22. 7£ l.OOCe- 
40 57-103 BL00358C
13.75 6.087e-14 122-
136 BL00358D 14.26
5.500e-13 143-158
BL00358A 13.06 1.931e-
11 33-44 


1414 


BL00282


Kazal serine protease 
inhibitors family 
proteins.


BL00282 1^.88 7.338e- 

10 511-534 


1415 


BL00023 


Type II fibronectin 
collagen-binding domain
proteins.


BL00023 24.31 4.300e- 
29 40-77 


1417 


PR00681 


RIBOSOMAL PROTEIN S1
SIGNATURE 


PR00681G 12.54 2.149e-
09 38-60 


1418 


DM00973 


3 kw RESISTANCE BENOMYL 
YLL028W CYCLOHEXIMIDE. 


DM00973A 21.17 1.462e- 
09 171-208 


1419 


PR00319


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319B 11.47 1.571e-
09 428-443 


1420 


PD01941 


TRANSMEMBRANE 
COTRANSPORTER SYMP. 


PD01941A 14.81 1.000e-
«U J.4<s-196 PD01941B 
15.02 7.049e-30 400- 

447 DDOI Q4 Iff it a-* 

2.475e-20 817-864 

19 488-543 PD01941D 
27.18 9.614e-18 641-
690 PD01941F 28.52 
5.382e-15 1038-1093 


1422 


PR00205


CADHERIN SIGNATURE


PR00205B 11.39 8.043e-
12 199-217 


1423 


PR00209


ALPHA/BETA GLIADIN 
FAMILY SIGNATURE 


fKuu^ubb 4. SB o.318e- 
11 1009-1028 


1424 


BL50002


Src homology 3 (SH3) 
domain proteins profile. 


14 367-38C BL50002A 
14 .19 9 25Qe-l? 5Qfi_ 
317 BL50002A 14.19
4.462e-11 208-227
BL50002B 15.18 1.000e-
09 244-258 


1425 


PF00628 


PHD-finger.


PF00628 15.84 3.045e-
12 330-345 


1426 


PF00628 


PHD-finger.


PF00628 15.84 3.045e- 
12 377-392 


1427 
1428


PR00405 


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR0040S& 11.83 S.li4e- 
16 281-299 PR00405A 
17.71 4.306e-14 262-
282 




BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 5.219e-
34 147-193 


1429 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 8.920e- 
10 577-592 


1430 

1431


PR00378


INOSITOL PHOSPHATASE

SIGNATURE

GRAVES DISEASE CARRIER


PR00378D 16.86 7.563e- 
12 295-314 PR00378B 
13.80 8.650e-10 166- 
186

PR00928B 13.53 3.769e-






PROTEIN SIGNATURE 


10 103-124 


1433 


BL01113 


Clq domain proteins. 


BL01113B 18.26 7.049e- 
15 14-50 BL01113C 
13.18 7.000e-12 82-102 


1434 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319B 11.47 7.983e- 
10 135-150 


1436 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 1.000e-
12 84-103 


1438 


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins.


BL00290B 13.17 2.500e- 
09 250-268 BL00290A 
20.89 4.000e-09 188- 
211 


1440 


PR00806


VINCULIN SIGNATURE 


PR00806B 4.28 4.960e- 
09 38-52 


1441 


PR00806


VINCULIN SIGNATURE 


PR00806B 4.28 4.960e- 
09 88-102 


1444 


BL00422 


Granins proteins. 


BL00422D 19.48 1.000e-
00 114-138 


1445 


PD01841 


PHOSPHORYLASE KINASE
ALPHA MUSCL. 


PD01841A 21.71 1.000e-
40 73-123 PD01841B
14.35 1.000e-40 144-
185 PD01841D 17.87
1.000e-40 206-258
PD01841F 13.36 1.000e-
40 296-345 PD01841G
24.26 1.000e-40 349-
403 PD01841I 23.00
1.000e-40 494-536
PD01841J 14.94 1.000e-
40 895-932 PD01841L
18.42 1.000e-40 1083-
1125 PD01841E 18.60
9.719e-38 258-296

PD01841K 14.81 1.000e-

35 1041-1071 PD01841H 

472 PD01841C 13.78 

PD01841M 10.82 1.250e- 
20 1175-1194 


1446 


PF00816 


H-NS histone family.


PF00816B 13.84 8.875e- 
09 190-220 


1447 


PR00048 


C2H2-TYPE ZINC FINGER
SIGNATURE 


PR00048A 10.52 2.080e- 
09 402-416 


1448 


DM00315 


072 RIBONUCLEASE
INHIBITOR. 


DM00315D 18.40 7.393e- 
09 23-67 


1451 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins.


BL00030B 7.03 2.800e- 
10 94-104 


1454 


DM01688 


2 POLY-IG RECEPTOR. 


DM01688D 13.44 7.146e- 
09 382-405 


1455 


PF00777 


Sialyl transferase 
family. 


PF00777C 18 60 2 Q?Ob. 
22 4-59 


1457 


BL00927 


Trehalase proteins.


BL00927C 10 83 B OflSp- "' 
09 42-53 


1460 


BL00545


Aldose 1-epimerase
proteins.


BL00545C 11.28 7.353e- 
17 169-182 

10.20 2.071e-15 73-89 
BL00545B 13.10 3.942e- 
09 140-153 


1466 


PR00097


ANTHRANILATE SYNTHASE 
COMPONENT II SIGNATURE 


PR00097C 9.42 9;U*9e- " 
09 233-245 


1472 


BL01129 


Hypothetical 
yabO/yceC/sfhB family 
proteins. 


BL01129E 13.25 5.250e- 
22 170-195 BL01129C 
25.56 9.526e-18 63-106 


1473 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790I 20.01 2.821e- 
09 2114-2145 




PF00686


Starch binding domain
proteins.


PF00686A 13.45 9.100e- 
09 267-277 


1477 


PF00566 


Probable rabGAP domain 
proteins.


PF00566A 12.64 7.333e-
10 466-476 


1478 


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030B 7.03 9.400e- 
10 43-53 


1479 


DM00406 


GLIADIN. 


DM00406 7.73 8.*41e-10 
292-305 


1480 


BL00290 


Immunoglobulins and
major histocompatibility 
complex proteins. 


BL00290B 13.17 2.385e- 
15 69-87 BL00290A 
20.89 5.091e-11 12-35


1481 


PR00150 


PHOSPHOENOLPYRUVATE
CARBOXYLASE SIGNATURE 


PR00150F 10.45 9.039e- 
09 21-51 


1482 


PF00780


Domain found in NIK1-
like kinases, mouse 
citron and yeast ROM. 


PF00780I 14.69 4.825e- 
09 107-137 


1483 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 1.153e-
09 108-162 


1485 


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 5.909e- 
25 17-56 


1486 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 1.529e-
09 34-50 


1488 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 9.586e- 
10 116-162 


1490 


BL00166


Enoyl-CoA

hydratase/isomerase
proteins.


BL00166D 22.87 2.607e-
24 190-226 BL00166C
18.93 5.500e-14 140-
167 BL00166B 16.92
9.357e-11 93-115


1491 


BL00452


Guanylate cyclases
proteins.


BL00452D 28.59 3.700e-
31 63-106 BL00452E
11.92 3.045e-13 115- 
131 


1492 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019A 11.19 3.667e- 
09 532-546 


1497 


BL00107


Protein kinases ATP-
binding region proteins.


BL00107B 13.31 1.000e-
11 384-400 BL00107A
18.39 5.345e-11 322-
353 


1500 


PF00876 


Ogre family. 


PF00876E 7.99 1.947e- 
10 107-117 


1502


BL00027


'Homeobox' domain
proteins.


BL00027 26.43 4.789e-
24 112-155 


1503


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 4.789e-
24 112-155 


1505 


BL01177 


Anaphylatoxin domain
proteins.


BL01177B 20.64 5.800e-
24 448-475 BL01177C
17.39 5.333e-19 402- 
421 BL01177B 13.61 
7.840e-16 155-171 
BL01177D 17.50 1.900e- 
15 427-445 


1506


BL00972 


Ubiquitin carboxyl- 
terminal hydrolases 
family 2 proteins. 


BL00972D 22.55 5.500e- 
14 311-336 BL00972A 
11.93 7.429e-14 48-66 
BL00972B 20.72 8.759e- 
10 341-363


1512


BL00523


Sulfatases proteins.


BL00523E 19.27 4.536e-
22 76-106 BL00523D
9.89 1.563e-11 40-52
BL00523F 10.85 4.162e- 
09 159-170 BL00523G 
9.46 5.333e-09 256-266 


1516 


BL00914


Syntaxin / epimorphin 
family proteins. 


BL00914 24.91 7.045e- 
14 168-218 


1518 


BL00600 


Aminotransferases class-
III pyridoxal-phosphate
attachment si.


BL00600A 17.98 6.143e- 
19 98-122 BL00600E 
16.43 1.771e-17 302- 








331 BL00600G 12.43
9.625e-17 377-396
BL00600B 19.60 5.091e-
15 160-186 BL00600C
16.18 6.040e-12 190-
206 BL00600F 8.77
1.000e-11 343-356
BL00600D 8.71 1.000e-
10 281-295 


1523 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930B 33.72 9.600e-
18 41-62 


1528 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE



PR00320B 12.19 4.774e- 
11 192-207 PR00320B 
12.19 8.839e-11 272-
287 PR00320B 12.19
9.743e-10 106-121
PR00320A 16.74 1.878e-
09 192-207 PR00320A
16.74 2.317e-09 106-
121 PR00320A 16.74
8.683e-09 272-287 
PR00320C 13.01 8.800e- 
09 106-121 


1538


DM01970 


0 kw ZK632.12 YDR313C
ENDOSOMAL III. 


DM01970B 8.60 4.508e-
15 171-184 


1539 


PF00781 


Diacylglycerol kinase 
catalytic domain 
proteins (presumed).


PF00781D 11.11 7.593e- 
10 103-127 


1540 


PR00965 


OCULAR ALBINISM TYPE 1 
PROTEIN SIGNATURE 


PR00965H 10.73 1.231e-
29 312-334 PR00965E 
12.93 5.846e-29 172- 
195 PR00965F 5.98 
1.123e-28 209-231 

rKUUsbsL 13 . \JH 1 . uuue- 

27 131-151 PR00965D

PR00965G 8.52 2.440e- 
27 258-279 PR00965B 
4.80 8.650e-26 88-109 
PR00965A 12.52 1.000e-

A3 o 3 — JJ rKUUJDD J, 

3.91 6.442e-25 385-406 


1541


BL01013 


Oxysterol-binding
protein family proteins. 


BL01013D 26.81 9.719e- 
17 163-207 


1543 


PD02699 


PROTEIN DNA-BINDING
BINDING DNA.


PD02699C 24.84 1.000e-
40 599-646 PD02699A 
8.91 2.286e-34 219-248 
PD02699B 18.28 6.143e- 
21 485-509 


1544 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.857e- 
10 182-197 PR00049D 
0.00 7.102e-09 67-82 


1547 


BL00951 


ER lumen protein
retaining receptor
proteins.


BL00951C 19.35 1.000e-
40 93-142 BL00951D
13.94 8.714e-40 142-
177 BL00951A 1 10
1.000e-38 2-38
BL00951B 14.23 6.250e-
33 38-69 


1548 


BL00536 


Ubiquitin-activating
enzyme proteins.


BL00536F 13.65 8.920e-
30 279-318 BL00536D 
22.91 5.737e-24 21-65 
BL00536E 16.94 4.696e- 
18 248-279 


1549 


PR00139 


ASPARAGINASE/GLUTAMINASE
FAMILY SIGNATURE 


PR00139C 11.72 9.679e- 
09 550-569 


1553 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 5.119e-
09 58-73 


1556


BL00061 


Short-chain

dehydrogenases/reductase
s family proteins.


BL00061B 25.79 6.276e- 
13 67-105 


1557 


BL01228 


Hypothetical cof family 
proteins . 


BL01228D 17.44 8.105e-
12 107-132 


1558 


BL01228 


Hypothetical cof family 
proteins . 


BL01228D 17.44 8.105e-
12 107-132 


1559 


BL01228 


Hypothetical cof family 
proteins . 


BL01228D 17.44 8.105e- 
12 107-132 


1562 


BL00522 


DNA polymerase family X 
proteins . 


BL00522C 11.90 6.600e-
18 412-436 BL00522B
27.30 1.738e-16 364-
410 BL00522A 25.52
6.000e-16 279-326
BL00522E 19.63 6.123e-
14 502-532 BL00522F
14.90 2.385e-13 551- 
575 


1563 


PF00651


BTB (also known as BR-
C/Ttk) domain proteins.


PF00651 15.00 1.947e-
11 46-59 


1564 


BL00299 


Ubiquitin domain 
proteins . 


BL00299 28.84 2.823e- 
10 324-376 


1566 


BL01013 


Oxysterol-binding
protein family proteins.


BL01013D 26.81 8.594e-
17 184-228 BL01013C
9.97 4.906e-12 14-24


1567 


BL00678 


Trp-Asp (WD) repeat
proteins proteins. 


BL00678 9.67 3.400e-10 
378-389 BL00678 9.67 
5.800e-10 418-429 
oliuud / o j . a / o.ouue— iu 
295-306 


1576 


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins.


BL00479B 12.57 5.235e-
17 297-313 BL00479A
19.86 6.625e-15 271-
294 BL00479A 19.86 
2.667e-14 147-170 
BL00479B 12.57 6.294e- 
12 173-189 


1576 


PR00665


OXYTOCIN RECEPTOR 
SIGNATURE 

• 


PR00665G 12.36 4.673e- 
24 364-384 PR00665D 

1 ><Vve*££ JLJo-133 

PR00665F 11.73 4.000e- 
22 337-354 PR00665C ' 
5.89 1 OOOe-90 fiS-fln 
PR00665B 5.29 4.337e-
19 24-39 PR00665E 
5.60 2.929e-15 246-260 
PR00665A 5.99 5.622e- 
15 11-25 


1577 


DM00099 


4 KW A55R REDUCTASE 

TERMINAL 

DIHYDROPTERIDINE. 


DM00099B 14.73 9.308e- 
10 127-137 


1579 


BL00524 


Somatomedin B domain 
proteins. 


BL00524A 9.65 £.776e- 
14 52-73 


1580 


PD02894 

* 


HYDROLASE N4-PRECURSOR
PROTEIN SIGNAL BE. 


PD02894B 13.93 6.959e- 
16 182-215 PD02894A 
21.96 2.125e-10 57-103


1581 


BL00411 


Kinesin motor domain 
proteins. 


BL00411C 15.04 5.292e-
12 32-54 BL00411H
15.66 4.441e-11 245-
276 


1582 


PR00604


CLASS IA AND IB 
CYTOCHROME C SIGNATURE 


PR00604A 11.13 2.440e- 
09 79-87


1584 


PF00651


BTB (also known as BR-
C/Ttk) domain proteins.


PF00651 15.00 1.000e-
10 225-238 


1585


DM01551 


kw OSTEOINDUCTIVE YOPM
MEMBRANE OUTER. 


DM01551C 14.62 9.455e-
11 125-145 


1586 


DM01354 


kw TRANSCRIPTASE REVERSE
II ORF2. 


DM01354S 11.61 7.750e- 
09 474-495 


1587


PR00072 


MALIC ENZYME SIGNATURE 


PR00072B 13.77 7.955e-
33 180-210 PR00072A 
12.75 6.040e-25 120- 
Tit DUrtnfti'jr ii ao 

AhD rKUUU /6V< 11. * £ 

2.286e-24 216-239 
PR00072D 10.77 3.400e- 
22 276-295 PR00072E 

318 PR00072G 10.45 
5.304e-19 433-450 

PR0D072F fi 87 S q"»s#*- 

15 332-349 


1589 


BL00191 


Cytochrome b5 family,
heme-binding domain
proteins.


BL00191H 15.64 1.537e- 
22 61-113 BL00191K 

442 


1590 


DM01970 


0 kw ZK632.12 YDR313C 
ENDOSOMAL III. 


DM01970B 8.60 7.716e- 
13 211-224 DM01970B 
8.60 2.157e-12 94-107 


1591 


DM00517 


5 kw NUCLEAR 60.7 NUP1 
CHROMOSOME . 


DM00517B 10.9* 6.62Se- 
16 1175-1193 DM00517A 
8.21 1.000e-ll 1015- 
1026 


1592


BL00037 


Myb DNA-binding domain
proteins repeat proteins 
proteins. 


BL00037B 15.92 3.250e- 
27 116-142 BL00037A 
16.68 2.500e-24 83-107 
BL00037A 16.68 3.250e-
12 31-55 BL00037B 
13<7£ ^.bzoe-11 64-90 
BL00037C 16.86 9.654e- 

1 A 1 4C_1 CA 


1595 


BL00028 


Zinc finger, C2H2 type, 

domain proteins.


BL00028 16.07 l.S14e- 

V9 iiU 14 / 


1598 


PF00628 


PHD-finger.


PF00628 15.84 3.250e-


1599


PR00014 


FIBRONECTIN TYPE III 
REPEAT SIGNATURE 


PR00014D 12.04 5.500e- 
09 980-995 


1600 


BL00518 


(RING finger), proteins. 


nf.rifi^i o' Vo "i-i d Clio 
10 30-39 


1602 


BL00412 


proteins . 


10 136-187 


1605 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 3.571e- 
10 44-57 


1607 


BL00252 


and delta family 
proteins.


23 20-57 BL00252B 

19 7fl C) lOqi.lC CQ.i no 


1610 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 1.000e-
08 61-94


1611


BL00904 


Protein 

c**jr 4, alio L cL does ct i-j^lld 

subunit repeat proteins 
proteins . 


BL00904C 8.98 7.353e- 
1.47 6.018e-09 127-168 


1612 


PF00168


C2 domain proteins. 


PF00168C 27.49 3.250e-
09 365-391


1613


BL00412 


Neuromodulin (GAP-43)
proteins.


BL00412D 16.54 6.051e-
09 932-983 BL00412D 

1 C CA H icio no on 
lb. 39 /.lbJe-03 9JJ- 

984 


1614 


BL00559


Eukaryotic molybdopterin
oxidoreductases
proteins.


BL00559I 13.63 3.531e-
25 54-83 BL00559K
13.17 2.957e-18 197-
224 BL00559J 19.63
6.870e-16 124-176 
BL00559L 13.60 9.000e- 
16 266-284 


1615 


PD01427 


TRANSFERASE 
METHYLTRANSFERASE BI.


PD01427B 22.45 3.025e- 
22 500-541 PD01427A 
19.94 8.773e-18 439- 








" 472 


1616 


BL00115 


Eukaryotic RNA 
polymerase II 
heptapeptide repeat 
proteins . 


BL00115Z 3 12 7 48£e- 
09 152-201 BL00115Z 
3.12 9.603e-09 145-194 


1617 


BL00303


S-100/ICaBP type calcium
binding protein.


BL00303B 26.15 7.750e-
32 51-88 BL00303A 
21.77 1.947e-31 4-41 


1618 


BL01254 


Fetuin family proteins. 


BL01254F 10.02 8.754e- 
09 137-147 


1619 


PD01888


PEPTIDE REDUCTASE 
PROTEIN METHI. 


PD01888B 25.10 1.000e-
40 47-97 PD01888C 

«1 i3D / iUUUc*JU X c. 3 - 

155 PD01888A 12.64 
8.800e-15 7-23 


1621 


PR00239 


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 


PR00239E 1.58 3.455e- 
09 692-704 PR00239E 

1 C<1 A cafio no CQ"2 nna 

A.30 *.DOVe _ vlif OJ/-/US 

PR00239E 1.58 4.580e-
09 702-714 PR00239E 


1622


PR00860 


VERTEBRATE 

METALLOTHIONEIN 

SIGNATURE 


PR00860B 7.04 1.900e- 
18 27-41 PR00860C 

.1.4/fie — 14 41*31 

PR00860A 5.46 1.720e- 
14 5-18 


1624 


PR00784 


MITOCHONDRIAL BROWN FAT 
UNCOUPLING PROTEIN 
SIGNATURE 


PR00784D 15.86 8.027e-
11 77-95 


1626 


BL00325


Actin-depolymerizing
proteins.


BL00325B 21.66 1.000e-
40 93-139 BL00325A
24.83 6.786e-23 61-93 


1631 


BL00064 


L-lactate dehydrogenase
proteins.


BL00064B 23.57 1.000e-
40 82-130 BL00064C
17.28 1.000e-40 137- 
182 BL00064E 27.20 
1 . ujue.-4u z^j-^7b 
BL00064F 25.14 7.882e- 
36 286-331 BL00064A 

91 1 £ 1 nn.flo_*a"i 00 en 
ax.id i.uuue-jj *<j-oU 

BL00064D 14.19 6.500e- 

3i ifl9-9i9 

J J. IQfi" «li{ 


1632 


PR00063 


RIBOSOMAL PROTEIN L27 
SIGNATURE 


PR00063B 15.24 9.700e- 
11.71 1.614e-09 34-59 


1634 


PR00239


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 


PR00239D 0.00 1.105e- 
3.51 2.538e-09 37-45 


1636 


BL01210 


Caveolins proteins . 


BL01210B 13.92 9.531e-
10 133-103 


1637 


BL00982 


Bacterial-type phytoene
dehydrogenase proteins.


BL00982A 18.41 5.388e- 
11 11-43 


1639 


BL01183 


ubiE/COQ5

methyltransferase family
proteins. 


BL01183B 21.31 8.144e- 
12 132-177 


1640 


PR00015


GRAM-POSITIVE COCCUS
SURFACE PROTEIN ANCHOR 
SIGNATURE 


PR00015B 9.84 8.468e- 
10 128-149 


1641 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320B 12.19 5.935e-
11 364-379 PR00320A
16.74 7.828e-11 364-
379 PR00320C 13.01 
2.800e-10 279-294 
PR00320C 13.01 2.800e- 
10 364-379 PR00320B 
12.19 5.114e-10 279- 
254 PR00320A 16.74 
1.659e-09 279-294 








PR00320A 16.74 2.098e- 
09 229-244 


1642 


PF00023 


Ank repeat proteins.


PF00023A 16.03 6.464e- 
09 114-130 


1643 


PR00169


POTASSIUM CHANNEL 
SIGNATURE 


PR00169A 16.77 1.806e- 
11 74-94 


1644


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 2.200e-10 
109-120 BL00678 9.67 
5.737e-09 528-539


1645 


BL01108 


Ribosomal protein L24
proteins . 


BL01108A 20.33 7.366e- 
17 56-89 


1646


PR00380


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 9.270e- 
21 103-125 PR00380D 
9.93 6.308e-18 386-408
PR00380C 13.18 7.923e- 
16 332-351 PR00380B 
12.64 6.657e-15 292- 
310 


1647


DM01242 


3 THREONINE--TRNA

LIGASE. 


kJVi V X « 1 « V» X / >13 9 . /7lc 

37 340-381 DM01242E 
23.00 5.071e-31 463-
505 DM01242D 23.29
3.925e-30 420-463
DM01242B 23.57 8.054e-
18 265-314 DM01242F
10.61 7.618e-14 526-
540


1649 


PD00126 


PROTEIN REPEAT DOMAIN 
TPR NUCLEA. 


PD00126A 22.53 5.500e- 
10 13-34 


1651 


BL01160 


Kinesin light chain 
repeat proteins . 


BL01160B 19.54 6.720e- 
11 431-485 


1652


BL00933


FGGY family of
carbohydrate kinases
proteins.


BL00933A 17.50 4.673e- 
12 11-35 BL00933E 
13.80 9.217e-09 456- 
472 


1653


BL00795 


Involucrin proteins. 


BL00795C 17.06 2.988e-
10 70-115 


1654


BL00982 


Bacterial-type phytoene
dehydrogenase proteins. 


BL00982A 18.41 7.750e- 
17 302-334 


1655 


BL00982 


Bacterial-type phytoene
dehydrogenase proteins. 


BL00982A 18.41 7.750e- 
17 282-314 


1656 


BL00741 


Guanine-nucleotide
dissociation stimulators
CDC24 family sign.


BL00741B 14.27 1.391e-
16 607-630 


1657


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 7.938e- 
11 114-136 


1658 


PR00910 


LUTEOVIRUS ORF6 PROTEIN
SIGNATURE 


PR00910A 2.51 8.889e- 
10 442-455 


1659 


BL00972


Ubiquitin carboxyl- 
terminal hydrolases 
family 2 proteins. 


BL00972D 22.55 4.140e- 
12 376-401 BL00972E 
20.72 5.629e-09 446- 
466 


1660 


BL00406 


Actins proteins.


BL00406D 12.58 8.767e- 
15 188-243 


1661


PR00105 


CYTOSINE-SPECIFIC DNA 
METHYLTRANSFERASE
SIGNATURE 


PR00105A 10.36 4.900e- 
13 1140-1157 PR00105B 
12.32 2.800e-12 1259- 
1274 PR00105C 10.86 
1.000e-10 1305-1319 


1662 


BL00280 


Pancreatic trypsin 
inhibitor (Kunitz) 
family proteins. 


BLO028O 24.6'i 3.172e- 
33 3119-3163 


1663 


PR00319 


BETA G- PROTEIN 
(TRANSDUCIN) SIGNATURE


PR00319D 11.64 6.625e- 
23 107-125 PR00319C 
13.41 5.714e-20 89-105 
PR00319A 15.27 5.286e- 
19 51-68 PR00319B 
11.47 8.200e-19 70-85


1664 


BL00018


EF-hand calcium-binding
domain proteins.


BL00018 7.41 5.050e-10
489-502 


1667 


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 8.500e- 
38 7-46 


1669 




NOL1/NOP2/sun family
proteins. 


BL01153D 19.69 1.188e- 
17 115-141 BL01153C 
13.67 8.977e-15 66-80 
BL01153B 20.52 1.885e- 
10 13-37


1671 


PR00678 


PI3 KINASE P85 
REGULATORY SUBUNIT 


PR00678H 9.13 3.100e- 
10 1146-1169 


1672 


BL00598


Chromo domain proteins.


BL00598 14.45 8.500e-
20 27-49 


1673 


PR00326 


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE


PR00326A 8.75 8.329e-
09 686-707 


1674 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.580e- 
11 343-358 PR00049D
0.00 1.286e-10 342-357 


1676 


PR00747


GLYCOSYL HYDROLASE 
FAMILY 47 SIGNATURE 


PR00747H 12.76 8.636e-
19 427-448 PR00747G
14.50 2.286e-16 368-
393 PR00747C 12.06 
7.500e-18 112-131 
PR00747A 14.05 4.600e- 
17 42-63 PR00747D 
15.23 8.759e-17 163-
183 PR00747E 15.13 
8.244e-15 254-272 
PR00747B 7.65 5.355e- 
13 75-90 PR00747F 
13.56 8.714e-10 311- 
328 


1677 


PR00747


GLYCOSYL HYDROLASE 
FAMILY 47 SIGNATURE 


PR00747H 12.76 8.636e- 
19 309-330 PR00747G 
14.50 2.286e-18 250- 
275 PR00747C 12.06 
7.500e-18 112-131 
PR00747A 14.05 4.600e- 
17 42-63 PR00747B
7.65 5.355e-13 75-90 
PR00747F 13.56 8.714e- 
10 193-210 


1680




Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 4.600e-10 
406-417 BL00678 9.67
6.684e-09 320-331 


1681 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins.


BL00678 9.67 4.600e-10
329-340 BL00678 9.67 
6.684e-09 243-254 


1683 


PR00326


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 1.346e- 
13 389-410 


1685 


PR00646


RDC1 ORPHAN RECEPTOR
SIGNATURE


PRtiO^H rf.3* 4.188e- 
09 755-771 


1690 
1691


BL01160


Kinesin light chain
repeat proteins.


BL01160B 19.54 6.644e-
09 75-129




PR00456


RIBOSOMAL PROTEIN P2
SIGNATURE 


PR00456E 3.06 7.281e- 
10 418-433 PR00456E 
3.06 7.281e-10 419-434 
PR00456E 3.06 8.125e- 


1692 


PR00456


RIBOSOMAL PROTEIN P2
SIGNATURE


PR00456E 3.06 7.281e-
10 487-502 PR00456E 
3.06 7.281e-10 488-503 
PR00456E 3.06 8.125e- 
10 489-504 


1693 


BL00674 


AAA-protein family 
proteins . 


BL00674C 22.60 8.043e- 
24 274-317 BL00674B 








4.46 4.000e-23 241-263 
BL00674D 23.41 8.560e- 
18 338-385 BL00674E 
15.24 1.720e-15 414- 


1697 


PR00409 


PHTHALATE DIOXYGENASE 
REDUCTASE FAMILY 
SIGNATURE 


PR00409F 12.70 4.388e- 
10 427-447 


1698 


PR00466


CYTOCHROME B-245 HEAVY 
CHAIN SIGNATURE 


PR00466C 10.17 3.443e- 
13 187-208 PR00466B 
5.03 5.500e-11 162-186
PR00466F 9.16 6.159e- 
09 498-517 


1699 




Zinc finger, C2H2 type,
domain proteins.


BL00028 16.07 9.217e-
12 283-300 BL00028
16.07 3.769e-11 255-
272 BL00028 16.07
5.154e-11 171-188
BL00028 16.07 5.500e- 
11 227-244 BL00028 
16.07 1.600e-10 199- 
216 


1700 


BL01019 


ADP-ribosylation factors 
family proteins. 


BL01019A 13.26 3.348e- 
15 62-102 BL01019B 
19.49 4.000e-15 107- 
162 


1703 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 2.484e- 
12 200-239 


1707 


PR00109


TYROSINE KINASE
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 4.558e- 
14 134-153 


1710 


PR00019


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 2.565e- 
10 116-130 PR00019B
11.36 4.600e-09 113- 
127 PR00019B 11.36 
7.120e-09 204-218 


1711 


BL01159 


WW/rsp5/WWP domain 
proteins. 


BL01159 13.85 6.523e- 
11 232-247 BL01159 
13.85 5.408e-10 613- 
628 


1712 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 7.000e- 
10 187-203 


1713 


PF00642 


Zinc finger C-x8-C-x5-C- 
x3-H type (and similar). 


PF00642 11.59 9.550e- 
11 230-241 


1714 


PF00642 


Zinc finger C-x8-C-x5-C- 
x3-H type (and similar). 


PF00642 11.53 9.550e- 
11 230-241 


1715 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 7.129e- 
no *7 _ ci 


1718 


BL00353 


HMG1/2 proteins. 


BL00353C 14.83 6.018e- 
11.47 8.866e-09 86-136 


1719 


BL00412 


Neuromodulin (GAP-43) 
proteins . 


BL00412D 16.54 5.408e- 
09 432-483 


1721 


BL00038 


Myc-type, 'helix-loop- 
helix' dimerization 
domain proteins. 


BL00038B 16.97 c.448e- 
12 79-100 BL00038A 
13.61 4.000e-11 52-68 


1723 


PD00567 


REPEAT HYD. 


PD00567C 9.17 8.500e- 
09 418-426 


1724 


BL01279 


Protein-L- 
isoaspartate (D- 
aspartate) O- 
methyltransferase signa. 


BL01279A 24.27 5.663e- 
12 233-281 


1728 


BL00018 


EF-hand calcium-binding 
domain proteins. 


BL00018 7.41 2.059e-11 
73-86 BL00018 7.41 
4.176e-11 157-170 


1730 


BL00594 


Aromatic amino acids 
permeases proteins. 


BL00594A 16.75 1.089e- 
09 17-61 


1731 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 9.676e- 
10 296-350 


1732 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 9.676e- 
10 316-370 


1733 


PF00850 


Histone deacetylase 
family . 


PF00850F 15.70 4.349e- 
22 246-279 PF00850D 
14.76 6.850e-20 177- 
201 PF00850E 8.88 
8.691e-18 209-235 
PF00850G 22.75 4.098e- 
14 281-323 


1734 


BL00354 


HMG-I and HMG-Y DNA- 
binding domain proteins 
(A+T-hook). 


BL00354C 6.61 5.932e- 
09 292-307 


1735 


DM00179 


w KINASE ALPHA ADHESION 
T-CELL. 


DM00179 13.97 5.263e- 
10 492-502 


1743 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.188e- 
11 5-27 PR00449D 
10.79 2.241e-10 109- 
123 PR00449E 13.50 
9.289e-10 144-167 


1744 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.188e- 
11 5-27 PR00449D 
10.79 2.241e-10 109- 
123 PR00449E 13.50 
9.289e-10 144-167 


1745 


BL00720 


Guanine-nucleotide 
dissociation stimulators 
CDC25 family sign. 


BL00720B 16.57 8.297e- 
15 136-160 


1746 


PR00081 


GLUCOSE/RIBITOL 
DEHYDROGENASE FAMILY 
SIGNATURE 


PR00081B 10.38 6.727e- 
11 45-57 PR00081E 
17.54 3.935e-10 150- 
168 


1747 


BL00439 


Acyltransferases 
ChoActase / COT / CPT 
family proteins. 


BL00439H 18.24 8.435e- 
14 65-91 BL00439G 
13.40 2.895e-12 3-14 


1749 


PR00819 


CBXX/CFQX SUPERFAMILY 
SIGNATURE 


PR00819B 10.83 7.158e- 
11 4-20 


1751 


PD00066 


PROTEIN ZINC-FINGER 
METAL-BINDI. 


PD00066 13.92 3.400e- 
14 33-46 PD00066 
13.92 1.000e-13 89-102 
PD00066 13.92 7.000e- 
13 61-74 PD00066 
13.92 6.571e-12 117- 
130 


1753 


BL01013 


Oxysterol-binding 
protein family proteins. 


BL01013D 26.81 6.516e- 
18 33-77 


1754 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790I 20.01 2.393e- 
09 490-521 BL00790I 
20.01 2.821e-09 60-91 
BL00790I 20.01 6.357e- 
09 287-318 


1756 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 9.750e- 
35 10-49 


1758 


DM00406 


GLIADIN. 


DM00406 7.73 7.600e-09 
653-666 


1762 


PD02929 


ADHESION GLYCOPROTEIN 
PRECURSOR I. 


PD02929A 28.27 4.529e- 
09 224-278 


1765 


PR00326 


GTP1/OBG GTP-BINDING 
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 5.950e- 
11 146-167 


1775 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 3.077e- 
14 523-539 


1776 


BL00942 


glpT family of 
transporters proteins. 


BL00942F 15.07 4.343e- 
10 371-389 BL00942B 
20.36 8.040e-09 94-137 


1777 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.373e- 
09 279-312 


1778 


BL00084 


Copper type II, 

ascorbate-dependent 

monooxygenases proteins. 


BL00084D 25.11 3.700e- 
20 169-224 BL00084B 
24.26 8.134e-16 10-58 
BL00084C 27.71 8.412e- 
11 107-158 


1779 


BL01013 


Oxysterol-binding 
protein family proteins. 


BL01013D 26.81 3.758e- 
18 611-655 BL01013A 
25.14 2.881e-15 344- 
380 BL01013C 9.97 
6.308e-13 435-445 
BL01013B 11.33 3.717e- 
12 409-420 


1783 


BL00741 


Guanine-nucleotide 
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 8.138e- 
13 492-515 


1784 


BL00741 


Guanine-nucleotide 
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 8.138e- 
13 492-515 



* Results include, in order: accession number subtype; raw score; p-value; position of 
signature in amino acid sequence. 
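
For readers who wish to process the RESULTS column programmatically, the field order stated in the footnote above can be illustrated with a minimal sketch. The function name `parse_results` and the record type `SignatureHit` are hypothetical and are not part of the original disclosure; the sketch assumes that each hit consists of exactly four whitespace-separated fields and that a value wrapped across lines always breaks after a hyphen.

```python
import re
from typing import List, NamedTuple


class SignatureHit(NamedTuple):
    """One hit from a RESULTS cell (hypothetical record type)."""
    subtype: str      # accession number subtype, e.g. "PR00081B"
    raw_score: float  # raw score
    p_value: float    # p-value
    start: int        # first residue of the signature
    end: int          # last residue of the signature


def parse_results(cell: str) -> List[SignatureHit]:
    """Parse one RESULTS cell into a list of hits.

    Assumes four fields per hit, in the order given in the footnote.
    """
    # Rejoin values wrapped across lines by the table layout,
    # e.g. "6.727e-\n11" -> "6.727e-11" and "150-\n168" -> "150-168".
    text = re.sub(r"-\s+", "-", cell)
    tokens = text.split()
    hits = []
    # Walk the tokens four at a time; an incomplete trailing hit is skipped.
    for i in range(0, len(tokens) - 3, 4):
        subtype, score, p_value, span = tokens[i:i + 4]
        start, end = span.split("-")
        hits.append(
            SignatureHit(subtype, float(score), float(p_value), int(start), int(end))
        )
    return hits


# Example: the RESULTS cell listed for SEQ ID NO: 1746.
cell = "PR00081B 10.38 6.727e-\n11 45-57 PR00081E\n17.54 3.935e-10 150-\n168"
for hit in parse_results(cell):
    print(hit)
```

Cells in which the character recognition has corrupted a field (for example a letter in place of a digit) would raise a `ValueError` in this sketch and would need to be corrected by hand first.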
TABLE 4 



SEQ ID 

NO: 


PFAM NAME 


DESCRIPTION 


p-value 


PFAM 
SCORE 


2 




immunoglobulin domain 


2.1e-32 


109.5 


3 


pkinase 


Eukaryotic protein kinase 
domain 


1.3e-29 


110.7 


4 




Zinc finger, C2H2 type 


1.6e-21 


84.9 


5 




Fibronectin type III domain 


0 


1097.1 




fn3 


Fibronectin type III domain 


0 


1035.0 


7 


fn3 


Fibronectin type III domain 


0 


1090.4 


8 


fn3 


Fibronectin type III domain 


0 


1097.1 


9 


TBC 


TBC domain 


4e-40 




10 


p450 


Cytochrome P450 


9.5e-17 


62.0 


12 


ank 


Ank repeat 


6e-20 


79.7 


14 


ig 


Immunoglobulin domain 


1.7e-05 


22.7 


15 


zf-MYND 


MYND finger 


1.3e-06 


35.4 


16 


zf-MYND 


MYND finger 


1.3e-06 


35.4 


17 


zf-C2H2 


Zinc finger, C2H2 type 


1.7e-99 


343.9 


18 


CAP_GLY 


CAP-Gly domain 


1.2e-25 


98.7 


20 


IMPDH_C 


IMP dehydrogenase / GMP 
reductase C terminus 


1.6e-119 


410.5 


21 


IMPDH_C 


IMP dehydrogenase / GMP 
reductase C terminus 


4.3e-102 


352.6 


22 


pkinase 


Eukaryotic protein kinase 
domain 


2.4e-79 


277.0 


23 


pkinase 


Eukaryotic protein kinase 
domain 


8.4e-74 


258.6 


25 


RNA_pol_A 


RNA polymerase alpha subunit 


0 


1077.7 


26 


C1q 


C1q domain 


1.9e-10 


44.4 


27 


Ribosomal_L2 
3 


Ribosomal protein L23 


7.8e-32 


111.2 


28 


Ribosomal_L2 
3 


Ribosomal protein L23 


1e-29 


104.2 


30 


zf-A20 


A20-like zinc finger 


1.5e-10 


48.5 


31 


zf-A20 


A20-like zinc finger 


1.5e-10 


48.5 


32 


FMN_dh 


FMN-dependent dehydrogenase 


5.4e-179 


608.1 


34 


PID 


Phosphotyrosine interaction 
domain (PTB/PID) 


3.6e-59 


209.9 


35 




Immunoglobulin domain 


1.4e-13 


48.8 


36 


ig 


immunoglobulin domain 


1.4e-13 


48.8 


40 


kinesin 


Kinesin motor domain 


6.7e-76 


265.6 


44 


Ets 


Ets-domain 


1.4e-56 


182.1 


45 


Ets 


Ets-domain 


1.4e-56 


182.1 


46 


LRR 


Leucine Rich Repeat 


1.7e-13 


58.3 


48 


zf-C2H2 


Zinc finger, C2H2 type 


2.3e-162 


552.8 


49 


ITAM 


Immunoreceptor tyrosine-based 
activation mot 


1.4e-05 


31.9 


50 


UCH-2 


Ubiquitin carboxyl-terminal 
hydrolase family 


1.1e-26 


102.0 


51 


UCH-2 


Ubiquitin carboxyl-terminal 
hydrolase family 


1.1e-26 


102.0 


52 


ras 


Ras family 


8.5e-45 


162.3 




PRK 


Phosphoribulokinase 


2.1e-65 


230.7 


54 


myb_DNA- 
binding 


Myb-like DNA-binding domain 


0.096 


15.2 


55 


voltage_CLC 


voltage gated chloride channels 


3.3e-186 


631.9 


56 


sugar_tr 


Sugar (and other) transporter 


0.00015 


-64.3 


57 


TBC 


TBC domain 


2.2e-37 


137.6 


58 


ank 


Ank repeat 


5.9e-25 


96.3 


59 


ank 


Ank repeat 


5.9e-25 


96.3 


67 


PMP22_Claudi 
n 


PMP-22/EMP/MP20/claudin family 


7.9e-49 


175.6 


68 


C2 


C2 domain 


7.9e-54 


192.2 


69 


C2 


C2 domain 


2.3e-54 


194.0 


70 


Kelch 


Kelch motif 


9.4e-99 


341.5 


72 


ig 


Immunoglobulin domain 


8.2e-28 


94.7 


73 


pkinase 


Eukaryotic protein kinase 
domain 


8e-69 


242.1 


74 


pkinase 


Eukaryotic protein kinase 
domain 


2.8e-38 


i An c 


76 


C4_Topoisom 


Topoisomerase DNA binding C4 

zinc fing 


5.4e-54 


192.8 


83 
84 
86 
88 


Peptidase_S9 

fn3 

SH2 

ig 


Prolyl oligopeptidase family 
Fibronectin type III domain 
Src homology domain 2 
Immunoglobulin domain 


4.3e-10 
4.1e-51 
3.1e-22 
0.0091 


3b\8 
183.2 
^7.7 
14.0 


09 
92 
93 
95 


WD40 

laminin_G 
AMP-binding 
pkinase 


WD domain, G-beta repeat 
Laminin G domain 
AMP-binding enzyme 

Eukaryotic protein kinase 
domain 


2.1e-21 
^.1e-27 
2.4e-13 
1.4e-59 


84.6 
98.5 
-J / . 2 
211.4 


96 


pkinase 


Eukaryotic protein kinase 
domain 


2.6e-51 


183.9 


97 
98 


adh_short 
kinesin 


short chain dehydrogenase 
Kinesin motor domain 




217.5 


101 


IRS 


PTB domain (IRS-1 type) 


2.2e-86 
5.4e-36 


300.4 
133.0 


102 


AAA 


ATPases associated with various 
cellular act 


6.8e-05 


-5.2 


104 

106 


pkinase 
ras 


Eukaryotic protein kinase 

domain 

Ras family 


2.7e-73 
8.3e-24 


256.9 
92.5 


107 
108 

109 


FYVE 

Cyt_reductas 
e 

zf-C2H2 


FYVE zinc finger 
reductase 

Zinc finger, C2H2 type 


5.4e-27 
7.7e-61 


100.7 
215.5 


~Tl3 ' 
116 


pkinase 
PH 


Eukaryotic protein kinase 

domain 

PH domain 


2.3e-122 
4e-88 


420.0 
306.2 


117 


lipocalin 


Lipocalin / cytosolic fatty- 
acid binding pr 


3.1e-11 
2.4e-14 


45.2 
53.5 


118 
120 


pkinase 
WD40 


Eukaryotic protein kinase 
domain 

WD domain, G-beta repeat 


4.5e-20 
2.4e-14 


76.3 
61.1 


121 
123 

124 
127 


WD40 

ltF5_elF4 elP 
2 

ig 

mito_carr 


WD domain, G-beta repeat 
eIF4-gamma/eIF5/eIF2-epsilon 

Immunoglobulin domain 
Mitochondrial carrier proteins 


2.4e-14 
1e-32 

6.5e-08 


61.1 
122.2 

30.6 


128 
129 

130 
133 


PP2C 
ATP1G1_PLM_M 
AT8 
pfkB 

ACBP 


Protein phosphatase 2C 
ATP1G1/PLM/MAT8 family 

pfkB family carbohydrate kinase 
Acyl CoA binding protein 


3e-16 
2.2e-71 
3.1e-20 

4.5e-42 
4.6e-22 


58.6 
250.6 
80.6 

1*7.1 
86.7 


134 

135 
136 


rrm 
IQ 

ATP1G1_PLM_M 
AT8 


RNA recognition motif. 

IQ calmodulin-binding motif 

ATP1G1/PLM/MAT8 family 


1.2e-31 
2.6e-08 
9.3e-22 


118.5 

41.0 

BS.7 


139 
140 


WH2 

zf-C2H2 


Wiskott Aldrich syndrome 

homology region 2 

Zinc finger, C2H2 type 


0.0067 


23.1 


141 

143 
146 


Peptidase_S2 
6 

arf 
KRAB 


Signal peptidase I 

ADP-ribosylation factor family 
KRAB box 


1.7e-82 

R Ta_ 1 n 
3 ♦ ' «S — 1U 

1.2e-39 


287.5 
35.7 

145.2 


148 
149 

TBT 


DUF6 
PDEase 

S4 


Integral membrane protein DUF6 

3'5'-cyclic nucleotide 

phosphodiesterase 
S4 domain 


7.3e-30 

0.096 

3.8e-80 

1.1e-08 


112.6 

8.0 

231.1 

42.3 


153 
154 

155 

157 


tRNA-synt_1d 
Cyt_reductas 

ras 
actin 


tRNA synthetases class I (R) 
FAD/NAD-binding Cytochrome 
reductase 
Ras family 

Actin 


3.8e-103 
7.8e-60 

3.6e-28 
J.8e-26 


356.1 
212.2 

107.0 
}7.1 


159 


Jacalin 


Jacalin-like lectin domain 


0.09 


-24.9 


160 


Zn_carbOpept 


Zinc carboxypeptidase 


5e-138 


471.9 


165 


pkinase 


Eukaryotic protein kinase 
domain 


5.1e-67 


236.1 


167 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


5.3e-07 


27.0 


ICO 

lO 0 


Ribosomal_S1 


Ribosomal protein S15 


1.1e-06 


29.0 


J. O _/ 


DEAD 


DEAD/DEAH box helicase 


1e-48 


157.0 


171 


DUF59 


Domain of unknown function 
DUF59 


0.07 


-17.4 


X I & 


pkinase 


Eukaryotic protein kinase 
domain 


3.7e-15 


58 . fa 


173 


globin 


Globin 


4.6e-18 


67.4 


X l*k 


WW 


WW domain 


7.3e-06 


32.9 


x to 


ras 


Ras family 


1e-31 


118.8 


178 


ATP1G1_PLM_M 


ATP1G1/PLM/MAT8 family 


2.5e-17 


71.0 


X 




Zinc finger, C2H2 type 


1.5e-95 


344.2 


180 


C1q 


C1q domain 


8.8e-72 


251.9 


190 


Y_phosphatas 
e 


Protein-tyrosine phosphatase 


4.9e-287 


967.0 


191 


efhand 


EF hand 


7.5e-16 


66.1 


193 


pkinase 


Eukaryotic protein kinase 
domain 


6.5e-82 


285 .6 


194 


bromodomain 


Bromodomain 


5.8e-31 


111 .4 


195 


PALP 


Pyridoxal-phosphate dependent 
enzyme 




227.1 




DnaJ 


DnaJ domain 


1.6e-38 


141.4 


199 


RrnaAD 


Ribosomal RNA adenine 
dimethylases 


0.00018 


16.9 


200 


acid_phospha 
t 


Histidine acid phosphatase 


2 .5e-10 


37.2 


201 


WH2 


Wiskott Aldrich syndrome 
homology region 2 


0.00048 


26.9 


204 


vATP- 
synt_AC39 


ATP synthase (C/AC39) subunit 


1.3e-159 


543 .7 




vATP- 
synt_AC39 


ATP synthase (C/AC39) subunit 


1.6e-139 


476 .9 


206 


ldl_recept_a 


Low-density lipoprotein 
receptor domain 


2 .4e-25 


97 .6 


209 


ank 


Ank repeat 


1.4e-19 


78 . 4 


210 


Rhomboid 


Rhomboid family 


0.0035 


1.2 


211 


C1q 


C1q domain 


1.6e-70 


247.7 


~5T5 


UQ_con 


Ubiquitin-conjugating enzyme 


7.4e-74 


258.8 


« X J 


UQ_con 


Ubiquitin-conjugating enzyme 


1e-53 


191.9 






DEAD/DEAH box helicase 


1.8e-43 


140.4 


216 


PMP22_Claudi 


PMP-22/EMP/MP20/Claudin family 


4.5e-21 


83.4 


218 


Glycoe^trans 


Glycosyl transferases 


4e-21 


83.6 


219 \ 




Immunoglobulin domain 


0 . 092 


10 . 7 


222 ' " 




WD domain, G-beta repeat 


7 . 4e-23 


89.4 


"2224 


TPR 


TPR Domain 


1.2e-08 


42.1 


225 


G 


DnaJ central domain (4 repeats) 


1 . 5e-38 


141.5 


226 


DnaJ_CXXCXGX 
G 


DnaJ central domain (4 repeats) 


1.5e-38 


141.5 




HSP70 


Hsp70 protein 


2.4e-54 


194.0 


230 


GSHPx 


Glutathione peroxidases 


•J » 4C"4 / 


J. / W . c, 


231 


tsp_l 


Thrombospondin type 1 domain 


0.0075 


17.1 


233 


cyclin 


Cyclin 


4 .6e-144 


492.0 


234 


ras 


Ras family 


4.8e-50 


179.7 


235 


LRR 


Leucine Rich Repeat 


1.2e-30 


115.3 


236 [ 


LRR 


Leucine Rich Repeat 


6.7e-29 


109.4 


237 


PDZ 


PDZ domain (Also known as DHR 
or GLGF) . 


1.7e-09 


45.0 


244 


dCMP_cyt_dea 
m 


Cytidine and deoxycytidylate 

deaminase 


2.5e-05 


31.1 


245 


ig 


Immunoglobulin domain 


6.7e-08 


30.5 


248 


wnt 


wnt family of developmental 

signaling protei 


9.1e-270 


742.6 


250 


mito_carr 


Mitochondrial carrier proteins 


1 . 3e-5E> 


193 . 6 


254 


adenylatekin 
ase 


Adenylate kinase 


1.8e-14 


55.7 


255 


Cation_efflu 
x 


Cation efflux family 


2.8e-33 


124.0 


256 


SH3 


SH3 domain 


3.9e-14 


60.4 


257 




transporter probein 


2 . 6e-52 


187.2 


258 


adenylatekin 
ase 


Adenylate kinase 


2.1e-110 


380.2 


259 


HIT 


HIT family 


8.2e-07 


25.3 " 


260 " 


Bacterial_PQ 
Q 




1 . 6e-15 


65 .0 


262 


proteasome 


Proteasome A-type and B-type 


6.5e-64 


225.7 


267 


pkinase 


Eukaryotic protein kinase 


6.3e-27 


101.0 


270 




Intermediate filament proteins 


3.2e-150 


512.5 


271 


se 


Choline/ethanolamine kinase 


2e-67 


237.4 


277 


Ribosomal_S7 


Ribosomal protein S7p/S5e 


3.3e-20 


80.6 


279 


pkinase 


Eukaryotic protein kinase 
domain 


3 . 3e-77 


269.9 


280 


WD40 


WD domain, G-beta repeat 


7.8e-73 


255.4 


281 


WD40 


WD domain, G-beta repeat 


7 . 8e-73 


255.4 


284 


zf-DHHC 


DHHC zinc finger domain 


4.6e-24 


93.4 


287 




Exonuclease 


1 .4e-67 


238.0 


291 


SAM 


SAM domain (Sterile alpha 
motif) 


0 . 034 


11.2 


292 


SAM 


SAM domain (Sterile alpha 
motif) 


0 . 034 


11.2 


294 


zf-C2H2 


Zinc finger, C2H2 type 


1.4e-29 


111.7 


295 


zf-C2H2 


Zinc finger, C2H2 type 


2.2e-125 


430.0 


296 


mito_carr 


Mitochondrial carrier proteins 


4.1e-59 


205.5 


297 


HMG_box 


HMG (high mobility group) box 


6.7e-29 


109.4 


302 


fil vron ^ra«n 
u^jfv>wo 1. & aim 

f 4 


Glycosyl transferase 


5e-87 


302.5 


304 


tRNA-synt_2 


tRNA synthetases class II (D, K 
and N) 


1.1e-84 


294.8 


305 


KRAB 


KRAB box 


2e-44 


161.0 


306 




j. ci,uyii j. l J.011 mouir . 


2 . 7e-44 


160 .6 


308 


7tm_l 


7 transmembrane receptor 
(rhodopsin family) 


5.2e-39 


126.1 


309 


DNA_polymera 
seX 


DNA polymerase X family 


2.4e-64 


227.2 


311 


F-box ~ 




9 . 5e-08 


39 . 2 


312 


ig 


Immunoglobulin domain 


6.8e-19 


65.9 


313 


Ets 


Ets-domain 


8.1e-60 


192.3 


315 


Kelch 


Kelch motif 


1.3e-106 


367.6 


317 


arf 


ADP-ribosylation factor family 


3.2e-35 


130.4 


318 


sugar_tr 


Sugar (and other) transporter 


0.0003 


-73.1 


320 


pkinase 


Eukaryotic protein kinase 
domain 


8.1e-83 


288.6 


322 


pkinase 


Eukaryotic protein kinase 
domain 


4.9e-81 


282.6 


324 


Xlink 


Extracellular link domain 


4.5e-143 


331 . S 


"326 


ARID 


ARID DNA binding domain 


5.1e-37 


136.4 


327 


HMG_box 


HMG (high mobility group) box 


6.7e-29 


109.4 


328 


cadherin 


Cadherin domain 


8.1e-81 


281.9 


331 


chromo 


'chromo' (CHRromatin 
Organization Modifier) 


4e-18 


£6.7 


333 


Peptidase_M2 
2 


Glycoprotease family 


1.2e-136 


467.4 


33S 


vwa 


von Willebrand factor type A 
domain 


2 .3e-07 


37.9 


339 




Ras family 


7.8e-07 


-59.1 


340 


zf-C2H2 


Zinc finger, C2H2 type 


8 . 2e-64 


225.4 


342 


zf-C2H2 


Zinc finger, C2H2 type 


2.4e-85 


297.0 


343 




Immunoglobulin domain 


0 . 0005 


18.0 


346 


pkinase 


Eukaryotic protein kinase 
domain 


6.5e-65 


229.1 


347 


pkinase 


Eukaryotic protein kinase 
domain 


6.5e-65 


229.1 


351 


"EGF 


EGF-like domain 


8.5e-20 


79.2 


352 


ank 




2.5e-101 


350.0 


354 


TBC 




5.1e-15 


63.3 


355 


PHD 




3 . 2e-07 


37.4 


358 


DUF6 


Integral membrane protein DUF6 


0.033 


15.8 


359 


zf-C2H2 


Zinc finger, C2H2 type 


7.4e-20 


79.4 


361 




Ank repeat 


6.6e-34 


126.1 


362 




Putative GTP-ase activating 

protein for Arf 


4.7e-53 


189.7 


363 


efhand 


EF hand 


5.4e-10 


46.6 


367 


LRR 


Leucine Rich Repeat 


8.8e-44 


158.9 


368 


laminin_G 


Laminin G domain 


1.5e-33 


121.7 


369 


PP2C 


Protein phosphatase 2C 


5.3e-20 


73.9 


372 


LIM 


LIM domain containing proteins 


9.9e-15 


57.1 | 


373 




KRAB box 


4.8e-23 


90.0 


376 




Ion transport protein 


2.9e-09 


-4.2 


377 




Beige/BEACH domain 


4.9e-208 


704.5 


380 ' 




Eukaryotic protein kinase 
domain 


1.6e-94 


327.5 


381 


AMP-binding 


AMP-binding enzyme 


1.4e-07 


-140.3 


382 


HECT 


HECT-domain (ubiquitin- 
transferase). 


1.3e-07 


-13.5 


384 


ank 


Ank repeat 


2.5e-101 


350.0 


386 




Immunoglobulin domain 


9.5e-06 


23.6 


388 


zf-C2H2 


Zinc finger, C2H2 type 


1.7e-42 


154.6 


389 




Immunoglobulin domain 


2.8e-15 


54.3 


390 


mito_carr 


Mitochondrial carrier proteins 


3.5e-67 


233.2 


392 


TPR 


TPR Domain 


6.1e-17 


69.7 


393 


SH3 


SH3 domain 


3.5e-09 


43.9 


394 


"AAA " 


ATPases associated with various 
cellular act 


4.1e-21 


83.6 


396 


spectrin 


Spectrin repeat 


2.1e-67 


237.3 


397 


zf-C2H2 


Zinc finger, C2H2 type 


0.0066 


23.1 


399 


fn3 


Fibronectin type III domain 


4.1e-102 


352.6 


400 


WD40 


WD domain, G-beta repeat 


0 . 00049 


26.8 


401 


E1_dehydrog 


Dehydrogenase E1 component 


3e-119 


409.6 


402 


fn3 


Fibronectin type III domain 


0 


1719.6 


404 


LRR 


Leucine Rich Repeat 


2.1e-10 


48.0 


405 




Cadherin domain 


8.1e-81 


281.9 


406 


zf-CXXC 


CXXC zinc finger 


5e-15 


63.4 


410 


RhoGEF 


RhoGEF domain 


1.1e-23 


92.1 


411 


F-box 


F-box domain. 


4.2e-06 


33.7 


412 


SNF2_N 


SNF2 and others N-terminal 
domain 


5.8e-16 


61.6 


415 


CPSase_L_cha 
in 


Carbamoyl-phosphate synthase 


1.5e-172 


586.6 


418 


LRR 


Leucine Rich Repeat 


3.8e-24 


93.6 


419 


DENN 


DENN (AEX-3) domain 


2e-58 


207.5 


420 


RasGEF 


RasGEF domain 


8 . le-43 


133 . / 


421 


ank 


Ank repeat 


1.4e-153 


523.7 


424 


G-patch 


G-patch domain 


1e-19 


78.9 


425 


pkinase 


Eukaryotic protein kinase 
domain 


2.2e-31 


117.1 


426 


Plexin_repea 
t 


Plexin repeat 


0.0023 


24.6 


427 


Plexin_repea 
t 


Plexin repeat 


0.0023 


24.6 


429 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


8.6e-11 


39.2 


431 


DEAD 


DEAD/DEAH box helicase 


le-^6 


214.6 


432 


SH3 


SH3 domain 


3.4e-16 


S7.2 


433 


GTP_CDC 


Cell division protein 


2.1e-114 


393.5 


436 


Collagen 


Collagen triple helix repeat 
(20 copies) 


4.6e-194 


658.1 


438 


Ricin_B_lect 
in 


Similarity to lectin domain of 
ricin b 


0 . 0085 


10.5 


441 


Alpha_adapti 
n_C 


Alpha adaptin carboxyl-terminal 
domai 


1.2e-256 


866.0 


442 


Alpha_adapti 
n_C 


Alpha adaptin carboxyl-terminal 
domai 


1.8e-235 


795.7 


443 


PDZ 


PDZ domain (Also known as DHR 
or GLGF) . 


1.9e-65 


230.9 


445 

446 
451 


LON 

ig 

sushi 


ATP-dependent protease La (LON) 
domain 

Immunoglobulin domain 


0.00012 
0.00011 


-17.1 
20.1 


452 
454 

456 


fn3 

pyridoxal_de 
C 

kinesin 


Sushi domain (SCR repeat) 
Fibronectin type III domain 
Pyridoxal-dependent 
decarboxylase conse 
Kinesin motor domain 


1.4e-18 
1.5e-06 
8.3e-14 


75.2 
35.2 
50.3 


457 
458 


neur_chan 
Josephin 


Neurotransmitter-gated ion- 
channel 
Josephin 


4.3e-217 
1e-175 

0.0002 


734.4 
597.1 

18.7 


468 
470 

471 


bZIP 

NTP_transfer 

ase 

WD40 


bZIP transcription factor 
Nucleotidyl transferase 

WD domain, G-beta repeat 


1.7e-07 
6.3e-0* 

2e-28 


31.8 
-25.3 

107.9 


473 
477 

473 


LIM 

zf-RanBP 

WD40 


LIM domain containing proteins 
Zn-finger in Ran binding 
protein and others. 
WD domain, G-beta repeat 


0.00021 
0.028 

6.5e-18 


20.7 
21.0 

73.0 


480 
481 

485 


KRAB 
ArfGap 

SH2 


KRAB box 

Putative GTP-ase activating 

protein for Arf 

Src homology domain 2 


le-31 
8.4e-66 

0.011 


118.8 
232.0 

11.4 


486 
487 

489 


C1q 
dsrm 

zf-C2H2 


C1q domain 

Double-stranded RNA binding 
motif 

Zinc finger, C2H2 type 


4.3e-74 
1.1e-47 


"259.6 

171.9 


490 

492 


Alpha_adapti 
n_C 

SKI 


Alpha adaptin carboxyl-terminal 
domai 

Shikimate kinase 


4.8e-153 
3.4e-222 


521.9 
751.6 


497 


ENV_polyprot 
ein 


ENV polyprotein (coat 
polyprotein) 


1.2e-10 
2.6e-22 


48.8 
"77.6 


498 

500 
501 


abhydrolase_ 
2 

rrm 
WW 


Phospholipase/Carboxylesterase 

RNA recognition motif. 
WW domain 


0.041 

5.4e-34 
4.6e-18 


-48.1 

126.4 
73.4 


502 
504 
505 


ig 

abhydrolase 
vwa 


Immunoglobulin domain 
alpha/beta hydrolase fold 
von Willebrand factor type A 


1.1e-10 

0.045 

7.1e-62 


39.5 
-3 . £ 
219.0 


508 
509 


Na_K_ATPase_ 
C 

Exonuclease 


Na+/K+ ATPase C-terminus 

Exonuclease 


2.3e-145 


496.3 


510 


Glycos_trans 
f_1 


Glycosyl transferases group 1 


1.3e-56 
2.9e-06 


201.5 
27.0 


511 


Glycos_trans 
f_1 


Glycosyl transferases group 1 


2.9e-06 


27.0 


512 


Glycos_trans 
f_1 


Glycosyl transferases group 1 


1.9e-09 


38.5 


514 


pro_isomeras 
e 


Cyclophilin type peptidyl- 
prolyl cis-tr 


1.8e-63 


221.4 


515 


EGF 




1.9e-18 


74.7 


"£16 




Quits mnHi i "1 a 


4.3e-38 


140.0 


523 




Immunoglobulin domain 


3 .3e-06 


25.0 


526 


UBX 




1 . le-34 


128 . 6 


528 


adh_zinc 


Zinc-binding dehydrogenases 


2.7e-34 


127.4 


530 


SAM 


SAM domain (Sterile alpha 


0.046 


10.0 


531 


adh_short 


short chain dehydrogenase 


0.0025 


-34.1 


532 


mito_carr 


Mitochondrial carrier proteins 


2.5e-81 


281.7 


"533 


mito_carr 


Mitochondrial carrier proteins 


2e-61 


213.5 


534 




Thiolase 


3.5e-183 


622.0 


535 


FMO-like 


Flavin-binding monooxygenase- 
like 


0 


1153.7 


536 


SCAN 


SCAN domain 


4e-55 


196\ i 


537 


tRNA-synt_1 


tRNA synthetases class I (I, L, 
M and V) 


3.1e-136 


466.0 


538 




tRNA synthetases class I (I, L, 
M and V) 


3.1e-136 


466.0 


539 


tRNA-synt_1 


tRNA synthetases class I (I, L, 
M and V) 


1.9e-117 


403.6 


540 


tRNA-synt_1 


tRNA synthetases class I (I, L, 
M and V) 


3.1e-136 


466.0 


541 


vATP-synt_E 


ATP synthase (E/31 kDa) subunit 


5.9e-85 


295.7 


543 


zf-C2H2 


Zinc finger, C2H2 type 


5.5e-69 


242.6 


"544 - 


DUF101 


Protein of unknown function 
DUF101 


8.5e-38 


139.0 


545 


TGFb_propept 
ide 


TGF-beta propeptide 


1.1e-67 


238.2 


547 


WD40 


WD domain, G-beta repeat 


2.6e-32 


120.8 


548 


RHD 


Rel homology domain (RHD). 


1.6e-238 


£86.2 


549 


MMR_HSR1 


GTPase of unknown function 


5.4e-67 


236.0 


551 


HECT 


HECT-domain (ubiquitin- 
transferase) . 


4.3e-127 


435.6 


554 


MHC_II_alpha 


Class II histocompatibility 
antigen, alp 


3.5e-74 


259.8 


555 




Putative zinc finger in N- 
recognin 


3.3e-16 


67.3 


556 


Kelch 


Kelch motif 


5.5e-29 


109.7 


561 




AMP-binding enzyme 


2.8e-06 


-163.7 


562 


PABP 


Poly-adenylate binding protein, 
unique domai 


4.9e-38 


139.8 


564 




Gag P30 core shell protein 


1.2e-67 


238.2 


566 


PWWP 


PWWP domain 


8.1e-16 


66.0 


567 


SCAN 


SCAN domain 


7.3e-68 


238.9 


569 




Eukaryotic protein kinase 
domain 


1.5e-84 


294.3 


570 


pkinase 


Eukaryotic protein kinase 
domain 


1.5e-84 


294.3 


571 




Carbon-nitrogen hydrolase 


0.00081 


-79.7 


572 




Myosin head (motor domain) 


0 


1495.2 


573 


myosin_head 


Myosin head (motor domain) 


0 


1490.4 


575 




*utp uiouu x e 


1.7e-23 


91.5 


576 




ouxp muuuxe 


1.7e-23 


91.5 


577 


DNA_pol_B 


DNA polymerase family B 


0 


1138.6 


578 


PDZ 


PDZ domain (Also known as DHR 


8.3e-09 


42.7 


579 


LRR 


Leucine Rich Repeat 


4.9e-21 


83.3 


580 


neur_chan 


Neurotransmitter-gated ion- 
channel 


5.9e-177 


601.3 


583 


sushi 


Sushi domain (SCR repeat) 


0 


1673.0 


584 


DEAD 


DEAD/DEAH box helicase 


7.3e-36 


116.3 


586 


KH-domain 


KH domain 


2.9e-13 


57.5 


587 


G-patch 


G-patch domain 


2.3e-14 


61.2 


589 


LIM 


LIM domain containing proteins 


2.3e-36 


133.4 


590 


bromodomain 


Bromodomain 


6.6e-32 


114.7 


591 


bromodomain 


Bromodomain 


6.6e-32 


114.7 


592 


hormone_rec 


Ligand-binding domain of 
nuclear hormone 


3.5e-22 


87.1 


593 


PHD 


PHD-finger 


3.8e-12 


53.8 


594 


cadherin 


Cadherin domain 


4.2e-99 


342.7 


596 


pkinase 


Eukaryotic protein kinase 
domain 


5e-92 


319.2 


597 


WD40 


WD domain, G-beta repeat 


0.00054 


26.7 


600 


FG-GAP 


FG-GAP repeat 


4.3e-75 


262.9 


602 


G_Adapt_CT 


Gamma-adaptin, C-terminus 


1.1e-53 


191.8 


603 


pkinase 


Eukaryotic protein kinase 
domain 


2.3e-86 


300.4 


605 


Collagen 


Collagen triple helix repeat 
(20 copies) 


8e-42 


152.4 


606 


mito_carr 


Mitochondrial carrier proteins 


6.3e-67 


232.3 


608 


PWWP 


PWWP domain 


2.6e-28 


107.5 


609 


PWWP 


PWWP domain 


2.6e-28 


107.5 


613 


CAP_GLY 


CAP-Gly domain 


0.0046 


20.1 


615 


RFX_DNA_bind 
ing 


RFX DNA-binding domain 


5.2e-54 


192.9 


616 


kinesin 


Kinesin motor domain 


1.1e-81 


284.8 


617 


kinesin 


Kinesin motor domain 


8.4e-80 


278.5 


618 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


0.0098 


13.1 


620 


MATH 


MATH domain 


7.8e-05 


22.2 


621 


Y_phosphatas 
e 


Protein-tyrosine phosphatase 


1.4e-32 


121.6 


622 


pkinase 


Eukaryotic protein kinase 
domain 


4 .4e-40 


146.6 


623 


BNR 


BNR repeat 


2.1e-11 


51.3 


624 


molybdopteri 
n 


Prokaryotic molybdopterin 
oxidoreductas 


1.4e-12 


42.2 


625 


TPR 


TPR Domain 


1.1e-17 


72.2 


627 


cNMP_binding 


Cyclic nucleotide-binding 
domain 


3.7e-58 


206.6 


630 


adh_short 


short chain dehydrogenase 


5e-17 


70.0 


631 


zf-C2H2 


Zinc finger, C2H2 type 


2.1e-88 


307.1 


632 


rrm 


RNA recognition motif. 


4e-05 


30.5 


635 


pkinase 


Eukaryotic protein kinase 
domain 


1.6e-104 


360.7 


636 


Fork_head 


Fork head domain 


5.9e-27 


103.0 


637 


pkinase 


Eukaryotic protein kinase 
domain 


3.8e-70 


246.5 


642 


TPR 


TPR Domain 


4.8e-08 


40.1 


643 


efhand 


EF hand 


1.9e-27 


104.6 


647 


SNF2_N 


SNF2 and others N-terminal 
domain 


1.2e-101 


351.1 


648 


PseudoU_synt 
h_2 


RNA pseudouridylate synthase 


1.9e-55 


197.6 


650 


zf-C2H2 


Zinc finger, C2H2 type 


0.0087 


22.7 


651 


ank 


Ank repeat 


~1.3e-iY 


11.9 


652 


I_LWEQ 


I/LWEQ domain 


9.5e-101 


341.0 


653 


neur_chan 


Neurotransmitter-gated ion- 
channel 


4.1e-171 


581.8 


654 


tsp_l 


Thrombospondin type 1 domain 


4.1e-47 


169.9 


659 


FH2 


Formin Homology 2 Domain 


le-107 


371.2 


661 


pou 


Pou domain - N-terminal to 
homeobox domain 


5.3e-45 


162.9 


662 


C2 


C2 domain 


6.7e-19 


76.2 




C2 


C2 domain 


6.7e-19 


76 .2 


664 


C2 


C2 domain 


6.7e-19 


76 .2 


667 


GST 


Glutathione S-transferases. 


9.3e-34 


114.4 


668 


LRR 


Leucine Rich Repeat 


9.3e-31 


115.6 


670 


spectrin 


Spectrin repeat 


4e-57 


203.2 


671 


I_LWEQ 


I/LWEQ domain 


9.5e-101 


341.0 


672 


ABC_tran 


ABC transporter 


5.3e-60 


212.8 


674 


WD40 


WD domain, G-beta repeat 


4.8e-24 


93.3 


675 


WD40 


WD domain, G-beta repeat 


4.8e-24 


93.3 


676 


LRR 


Leucine Rich Repeat 


0.0015 


25.2 


679 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H 
type 


2.6e-29 


107.7 


680 


zf-C2H2 


Zinc finger, C2H2 type 


5.2e-05 


30.1 


681 


CH 


Calponin homology (CH) domain


2.4e-17


71.1


682


DSPc


Dual specificity phosphatase,
catalytic doma


A 1 a A 1 


— icg c 

156.6


683 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


w • UjI 


10.8


687 


Synapsin 


Synapsin 


""0 


i a on a 


689 


PR55 


Protein phosphatase 2A 
regulatory subunit PR 


o 




691 


homeobox 


Homeobox domain 


8.5e-30


112.4


696 


Peptidase M2 
4 


metallopeptidase family M24 




Tin c 


697 


RhoGEF 


RhoGEF domain 


9.5e-35




698 


PHD 


PHD-finger


0.008


9.3


701 


zf-C2H2


Zinc finger, C2H2 type 


-> - JC'i^J 


422.0 


702 


Sulfatase


Sulfatase


3e-231 


781.6 


703 


zf-C2H2


Zinc finger, C2H2 type


5.7e-20


79.8


707 


Acyl_transf 


Acyl transferase domain 


1.1e-22


88.8 


708 


WD40


WD domain, G-beta repeat


4.8e-19


76.7


710 


Ran_BP1


RanBP1 domain.


8.4e-06


-7.3 


713 


DEAD 


DEAD/DEAH box helicase


9.9e-42 


134.9 


714 


PH 


PH domain


1.6e-09


39.0 


715 


DSPc 


Dual specificity phosphatase,
catalytic doma


1.5e-37 


138.2 


717 


Sialyltransf 




7.5e-31


115.9


718 






1e-29


100.8 


719 


integrin_B


Integrins, beta chain


0


1125.4


720 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


1.1e-08


32.4 


722 


Peptidase_C2


calpain family cysteine 
protease 


3e-145 


495.9 


723 


ig 


Immunoglobulin domain 


2.2e-05


22.4 


724 


F-box 


F-box domain. 


0.007


23.0 


725


Nop


Putative snoRNA binding domain


8.1e-58


205.5


726


Nop


Putative snoRNA binding domain


8.1e-58


205.5


727


WD40


WD domain, G-beta repeat


7.5e-26


99.3


730 


dsrm


motif 


0.027 


12.1 


731 


dynamin 


Dynamin family 


4.2e-16


66.9 


733 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H
type


2.8e-10


41.7


735


CDP-
OH_P_transf


CDP-alcohol
phosphatidyltransferase


4.2e-26


100.1


738


DEAD


DEAD/DEAH box helicase


8.6e-57


182.5


739 
742 
743 


TSC22
ras

PMI_typeI


TSC-22/dip/bun family
Ras family

Phosphomannose isomerase type I


6.5e-32 
9 9a- i nn 

1.2e-243 


119.5 
346.9 
822.9 


747 
748 

749 


trypsin 
kazal 

efhand


Trypsin

Kazal-type serine protease
inhibitor domain
EF hand


6.4e-88
2.2e-52

6.3e-06 


279.4 
187.4 

33.1 


751 
752 


zf-C2H2


PHD-finger

Zinc finger, C2H2 type


4.9e-16 
3.2e-21 


66.7 
83.9




Hydrolase 


haloacid dehalogenase-like 
hydrolase 


6.1e-11


49.8 


754 


Ribosomal L3 
9 


Ribosomal L39 protein 


0.00018 


26.7 


755 
758 


PH
SCAN 


PH domain 

SCAN domain 


3.6e-14 


55.7 


759 
760


PA
arf


PA domain

ADP-ribosylation factor family


1.4e-53
0.0065


191.5 
23.1 


761 


CIDE-N 


CIDE-N domain


2.2e-19
2.2e-40 


77.8 

147.6


762 


histone 


Core histone H2A/H2B/H3/H4


9.9e-53 


188.6 


763 


zf-MYND


MYND finger


4.1e-14


60.3 


764 


pou 


Pou domain - N-terminal to
homeobox domain


1e-52


188.6 


/© / 


vwc 


von Willebrand factor type C
domain 


2.9e-34 


127.3




^ V\ T-l y-3 


EF hand 


4.8e-11


50.1 


770




Zinc finger, C4 type (two 
domains) 


2.4e-53


181.6 


772 




Ras family


7e-90 


312.0 


773 


Sulfatase


Sulfatase


1e-142


487.5 


775 




Zinc finger, C2H2 type 


1.1e-12


55.5 


776


zf-C2H2


Zinc finger, C2H2 type


1.1e-12


55.5 


777


zf-C2H2 


Zinc finger, C2H2 type 


1.1e-12


55.5 


778 


rrm 


RNA recognition motif. 


2.1e-32 


121.1 


779 


G6PD 


Glucose-6-phosphate
dehydrogenase 


1.5e-76 


236.6 


780


spectrin 


Spectrin repeat 


3.7e-29


110.3 


781 


mito_carr


Mitochondrial carrier proteins 


4.6e-57


198.5 


782 


SCAN 


SCAN domain 


1.3e-24 


95.2 


783 


PDZ 


PDZ domain (Also known as DHR 
or GLGF).


4.1e-07


37.1 


•joe 

too 


DEAD 


DEAD/DEAH box helicase 


6e-06 


21.7 




ras 


Ras family 


5.3e-39 


143.0 


787 


RNase HII 


Ribonuclease HII


2.5e-67 


237.1 


790 


PI3_PI4_kina 
se 


Phosphatidylinositol 3- and 4-
kinases


5.4e-108 


372.2 


"70c 


cadherin 


Cadherin domain 


2.5e-40


147.4


/y© 


ARID 


ARID DNA binding domain 


1.6e-20 


81.6 


797 


trypsin 


Trypsin 


9.9e-20 


64.8 


799 


CH 


Calponin homology (CH) domain 


3.7e-15 


63.8 


801


Gal-
bind_lectin


vertebrate galactoside-binding 
lectin 


4.1e-25 


88.7 


803 


WD40 


WD domain, G-beta repeat 


0.00082


26.1




TBC 


TBC domain 


1.8e-26 


101.4 


807 


TBC 


TBC domain 


1.8e-26 


101.4 


808 


CN_hydrolase 


Carbon-nitrogen hydrolase


8.8e-80 


278.5 


811 


CBFD_NFYB_HM
F


Histone-like transcription
factor 


fie- 14 


59.8 




adh_short


short chain dehydrogenase


8.1e-20


79.3 


814 


IMP4 


Domain of unknown function 


3.3e-71 


250.0 


815 


zf-C2H2


Zinc finger, C2H2 type 


8.2e-66 


232.1 


816


Pept_tRNA_hy 
dro 


Peptidyl-tRNA hydrolase 


1.6e-37


138.0 


817 


ARID 


ARID DNA binding domain 


2.5e-18 


74.3


826 


IF5_eIF4_eIF
2


eIF4-gamma/eIF5/eIF2-epsilon


1.6e-32 


121.5 


830 


ArfGap


Putative GTP-ase activating 
protein for Arf 


1.5e-53 


191.3 


831


LRR 


Leucine Rich Repeat 


2.1e-26 


101.1 


832 


laminin_EGF 


Laminin EGF-like (Domains III
and V) 


2e-57 


204.2


839 


rrm


RNA recognition motif. 


1.3e-22 


88.5 


840 


Y_phosphatas 
e 


Protein-tyrosine phosphatase


2.6e-119


409.8


841


pkinase 


Eukaryotic protein kinase 
domain 


3.4e-100


346.3 




Ribosomal_L2
2e


Ribosomal L22e protein family


1e-64


228.4 


846 


IBR 


IBR domain 


?e-15 


62.5 


849 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger) 


7.4e-07 


26.5 


850 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


0.00016


18.9 


851 


SET 


SET domain 


5e-30 


113.2 


852 


SRCR 


Scavenger receptor cysteine-
rich domain


0


1025.4


853


SRCR


Scavenger receptor cysteine-
rich domain 


0 


1025.4 


857 


lactamase_B


Metallo-beta-lactamase
superfamily


0.012


-6.0


858 


COX6A 


Cytochrome c oxidase subunit
VIa


3.4e-58


206.7


859


rrm 


RNA recognition motif. 


5.4e-45


162.9 


861 


PRK 


Phosphoribulokinase


5.1e-62


219.4


863 


mito_carr 


Mitochondrial carrier proteins 


2.9e-53 


185.5 


864 






4.7e-158


538.5 


866 




Immunoglobulin domain 


4e-12 


44.1 


867 


zf-C2H2


Zinc finger, C2H2 type


7e-135 


461.5 


872




Core histone H2A/H2B/H3/H4


4.9e-41


149.8 


874 


CPSase_L_cha 


Carbamoyl-phosphate synthase


2.1e-218 


739.0 


879 


Ribosomal_S1
2e


Ribosomal protein S12e


2.1e-98


340.3 


882


serpin


Serpins (serine protease
inhibitors)


2.5e-42 


145.7 


883 


Patatin 


Patatin 


1.2e-51


182.0 


884 


RA 


Ras association (RalGDS/AF-6)
domain


0.044


8.0 


887 


DUF92 


Integral membrane protein DUF92 


2.7e-12


54.3 


ot5y sugar_tr 


Sugar (and other) transporter 


8.2e-63


222.1 




DUF28 


Domain of unknown function 
DUF28


1.3e-43 


158.3 


896 


IP_trans 


Phosphatidylinositol transfer


6.5e-98


338.7


898 


DEAD 


DEAD/DEAH box helicase


1.5e-48


156.5


899 


KE2 


KE2 family protein


7e-61 


215.7 


900 


KE2 


KE2 family protein


4.3e-51


183.2


901 


zf-C2H2


Zinc finger, C2H2 type


2.7e-57


203.8


902 


ras 


Ras family 


2.3e-75 


263.8 


904
906


TPR 




3.2e-22


87.2 




GBP 


Guanylate-binding protein


8.9e-253


853.1


907 


GBP 


Guanylate-binding protein


1.1e-239


809.6 


908 


WD40 


WD domain, G-beta repeat


2.6e-26


100.8 


909 


PH 


PH domain 


1.3e-09 


39.4 


910 


zf-C2H2


Zinc finger, C2H2 type


2.5e-39


144.1 


913 


Epimerase 


NAD dependent
epimerase/dehydratase family


5e-07 


-88.5 


921 


TBC 


TBC domain


1.5e-09


30.7 


922 


WD40 


WD domain, G-beta repeat 


l.£e-25 


98.2


923 


WD40


WD domain, G-beta repeat 


8.2e-07 


36.1 


924 


Hydrolase 


haloacid dehalogenase-like 
hydrolase


2.9e-05 


29.1 


925 


UQ_con


Ubiquitin-conjugating enzyme


0.00033


-27.6 


926 


CH 


Calponin homology (CH) domain


3.3e-53


190.2 


928 


WD40 


WD domain, G-beta repeat 


5.9e-48


172.7 


929 


zf-C3HC4


Zinc finger, C3HC4 type (RING 


3.1e-10 


37.4 


930


Ribul_P_3_ep
im


Ribulose-phosphate 3 epimerase
family


7.2e-105


361.8


931 


Ribul_P_3_ep
im


Ribulose-phosphate 3 epimerase
family


1.2e-96 


334.4 


936 


C2 


C2 domain


2 .'Ae-62 


220.7


937


NAP_family


Nucleosome assembly protein
(NAP)


1.1e-22


84. £ 


940 


abhydrolase 


alpha/beta hydrolase fold 


0.011 


3.1 


944 


Tropomyosin 


Tropomyosins 


3.2e-07


25.1 


948 


pkinase 


Eukaryotic protein kinase
domain


3.4e-75


263.2 


949 


WD40 


WD domain, G-beta repeat 


1.8e-27 


104.7


950 


Acyltransfer
ase


Acyltransferase


1.6e-07 


38.4 


951 


SAM 


SAM domain (Sterile alpha 


0.014 


14.5 


954


GPO IDk Mnr-A 


UAiaoreauctase laroixy 


i-.3e-li 


52.0 


955 


BTB 


BTB/POZ domain


7e-22


86.1


956


BTB


BTB/POZ domain


7e-22


86.1


957 


CDP-
OH_P_transf


CDP-alcohol


0.053 


-22.2 


959 


ras 


Ras family


2.4e-97


336.8


960 


ras 




8.4e-43


155.5


961 


Acetyltransf 


Acetyltransferase (GNAT) family 


1.2e-08 


42.2 


962 




short chain dehydrogenase 


2.4e-31


117.6


963


mutT


Bacterial mutT protein 


5.6e-06 


26.2 


969 


IF-2B 


Initiation factor 2 subunit 
family 


8.4e-193


653.9 


970 


RNase_PH


3' exoribonuclease family 


9e-24 


92.4


975


WW


WW domain


5.7e-25 


96.4 


977 


PDZ


PDZ domain (Also known as DHR
or GLGF).


3.6e-21


83.7 


978 


7 


Ribosomal protein L17 


2.4e-20 


81.0 


979 




LIM domain containing proteins 


5.8e-42


152.8 


980 


Calsequestri 
n 


Calsequestrin 


1.7e-297 


1001.7 


982 


HSP20


Hsp20/alpha crystallin family


1.2e-10


43.2 


983


oxidored_q6


NADH ubiquinone oxidoreductase,
20 Kd sub


4.8e-63


222.9


988


TBC


TBC domain


2.2e-50 


180.8 


989 


TBC 


TBC domain 


2.2e-50 


180.8 


993 


tRNA_int_end
o


tRNA intron endonuclease 


0.0017 


-34.2 


994 


homeobox 


Homeobox domain 


4e-18 


73.6 


997 


pyr_redox 


Pyridine nucleotide-disulphide
oxidoreducta 


0.012 


11.6 


1000 


mito_carr


Mitochondrial carrier proteins


9.7e-123


421.2


1001


RA


Ras association (RalGDS/AF-6)
domain 


1.2e-15 


65.4


1004


DUF81


Domain of unknown function
DUF81


0.099


10.2 


1005


actin


Actin


1.3e-174


574.3 


1006 


actin 


Actin 


3.1e-130


428.6


1007


cpn60_TCP1


TCP-1/cpn60 chaperonin family


3.7e-195


661.8 


1008 


TPR 


TPR Domain


8.1e-44


159.0


1009 


zf-C2H2


Zinc finger, C2H2 type


3.6e-61


216.6 


1011


zf-C2H2


Zinc finger, C2H2 type


3.6e-61


216.6 


1012 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


4.7e-15


53.1 


1016 


tRNA-synt_2c


tRNA synthetases class II (A)


2.3e-15


55.2 


1018 


RhoGAP 


RhoGAP domain 


1.6e-78 


274.3 


1022 


PGAM 


Phosphoglycerate mutase family


3.8e-18


69.7 


1026 


HMG_box


HMG (high mobility group) box


8.4e-20


79.2 


1027 


TBC 


TBC domain 


7.3e-45 


162.5 


1028 




Ubiquitin-conjugating enzyme


1.4e-49 


178.1 


1032 


PDZ 


PDZ domain (Also known as DHR 


0.028 


TT.3 


1034 


Hydrolase


haloacid dehalogenase-like
hydrolase 


2e-21 


84.6 


1037


KRAB


KRAB box


4.8e-06 


32.4 


1038 


Cation_efflu
x


Cation efflux family


7.1e-42


152.5


1040 


ART 


NAD:arginine ADP-
ribosyltransferase


4.7e-47 


169.1 


1042 


WD40


WD domain, G-beta repeat


1.9e-18


74.7


1043 


zf-C2H2


Zinc finger, C2H2 type 


3.7e-24 


93.7


1045 


lectin_c 


Lectin C-type domain 


1.9e-28 


108.0 


1046 


Glucosamine_
iso


Glucosamine-6-phosphate
isomerase


0.00013


-25.1


1047 


ligase-CoA 


CoA-ligases 


4.5e-80


279.4 


1043 


ig 




1.7e-09


35.6


1050 


Ribosomal_L2
4e




2e-33 


124.5


1054 


Amidase 


Amidase 


4.3e-152


518.7


1055 


rrm 


RNA recognition motif.


j - oe-<£b 


100.3


1058 


annexin 


Annexin 


6.9e-44


159.2


1059 


PMP22_Claudi 
n 


PMP-22/EMP/MP20/claudin family




-23.6


1060 


homeobox


Homeobox domain




117.2


1062 


Acyltransfer
ase


Acyltransferase




10.5


1064


AMP-binding


AMP-binding enzyme


o . oc xuu 


345.3 


1065 


LRR 


Leucine Rich Repeat 


O • jc AS 


60.6 


1066 


GTP1_OBG


GTP1/OBG family




141.8


1071 


ig 


Immunoglobulin domain 




159.1


1072 


PHD 


PHD-finger


6.8e-07 


36.3 


1074 


DENN 


DENN (AEX-3) domain


8.3e-33


121.5


1075 


SCP 




4.7e-41


149.8


1077 


OLF 


Olfactomedin-like domain


2.2e-66


234.0


1078


mito_carr


Mitochondrial carrier proteins


1e-42


149.3 


1079 


WD40


WD domain, G-beta repeat 


6.2e-45 


162.7 


1087


START 


START domain


1.5e-48


174.7


1093 


DSPc


Dual specificity phosphatase,
catalytic doma


3.3e-63


223.4


1094


GSHPx


Glutathione peroxidases


9.6e-41


148.8


1095


DUF25


Domain of unknown function
DUF25


2e-75 


264.0 


1096 


DUF25


Domain of unknown function
DUF25


6e-75


262.4


"iios 


se 


Nitroreductase family


1.3e-13 


58.6 


1106 


PTE 


Phosphotriesterase family


1.3e-179


610.1 


1107 


DAGKc 


Diacylglycerol kinase catalytic
domain


0.00049


19.6 


1109 


ras 


Ras family 


1.3e-15 


40.7 


1115 




Putative GTP-ase activating
protein for Arf


9.7e-47


168.7 


1116 


HMG14_17


HMG14 and HMG17


4.4e-21


83.5 


1117 


HMG14_17


HMG14 and HMG17


9.9e-12


52.4 


1119 


e 


Fumarylacetoacetate (FAA)
hydrolase fam


2e-83 


290.6 


1120 


pkinase 


Eukaryotic protein kinase
domain


1.4e-94


327.6


1123 


abhydrolase 


alpha/beta hydrolase fold 


9.2e-23 


89.0


1129 


pro_isomeras
e


Cyclophilin type peptidyl-
prolyl cis-tr


2.2e-56


197.1 


1131 


DnaJ 




1.6e-30


114.9


1132 


WD40 




1.3e-19


78.6


1133 


WD40 


WD domain, G-beta repeat 


1.8e-15 


64.9 


1134 


PH 




0.0015


17.8


1136 


Adap_comp_su
b


Adaptor complexes medium
subunit family


1.2e-256 


666.0 


1137 


Adap_comp_su
b


subunit family


2.5e-209


708.8


1139 


ras 


Ras family 


1.5e-86 


301.0 


1141 


pkinase


Eukaryotic protein kinase
domain 


9.4e-74 


258.4 


1152 


Acyltransfer
ase


Acyltransferase


1.2e-05


29.9


1153 


IRS 


PTB domain (IRS-1 type)


5.4e-55 


196.1 


1155 


ig 


Immunoglobulin domain 


1.3e-31 


106.9


1157


Asparaginase 
2 


Asparaginase 


6.4e-72 


252.3 


1159 


GMC_oxred


GMC oxidoreductases


4.7e-142


485.3 


1160 


zf-AN1


AN1-like Zinc finger


0.00021


27.9 


1163 


linker_histo


linker histone H1 and H5 family


3.8e-14


60.4


1164 


DED 


Death effector domain


3.9e-05


30.5


1165 


IRS 


PTB domain (IRS-1 type)


2.6e-43 


157.3 


1166 


IRS 


PTB domain (IRS-1 type)


2.6e-43


157.3


1168 


SAM 


SAM domain (Sterile alpha 
motif)


0.04 


10.5 


1170 




alpha/beta hydrolase fold


0.090


-7.5


1174 


SAP 


SAP domain 


3.9e-10


47.1


1177 




Protein phosphatase 2C 


5.3e-31 


112.5 


1178 


WD40 


WD domain, G-beta repeat 


4.7e-35


129.9 


1180 


Ets 


Ets-domain 


1.8e-09 


33.3 


1181 


Collagen 


Collagen triple helix repeat
(20 copies)


0.00016


24.7 


1182 




TCL1/MTCP1 family


9.5e-56 


198.6 


1184 




RasGEF domain


1.7e-88 


307.4 


1185 


mito_carr


Mitochondrial carrier proteins 


1.5e-62 


217.3 


1187 


UPAR_LY6


u-PAR/Ly-6 domain


0.0042 


15.6 


1188 


Orn_DAP_Arg_ 


Pyridoxal-dependent
decarboxylase 


6.2e-128 


430.6 


1193 


Stathmin 


Stathmin family 


1.8e-90 


314.0 




Stathmin 


Stathmin family 


1.8e-90


314.0




Sec1


Sec1 family


3.2e-183 


622.1 


1196 


pyr_redox 


Pyridine nucleotide-disulphide 
oxidoreducta 


3.1e-32 


111.8


1197 


Glyco_transf 
8 


Glycosyl transferase family 8 


1.2e-09 


45.5 


1202 


K_tetra 


K+ channel tetramerisation 
domain 


0.022 


-16.8 




adh_short 


short chain dehydrogenase 


8.3e-45 


162.3 




un i e^jne t ny 1 1 


ubiE/COQ5 methyltransferase
family 


1.3e-121 


417.4 


1208 


7tm_3


7 transmembrane receptor 


7.2e-09 


29.0 


1209 




Ank repeat 


3.9e-15 


63.7 


1210 


vATP~ """ 


ATP synthase (C/AC39) subunit 


2.5e-128 


439.7 ~~ 


1212 




Zinc finger, C2H2 type


5.5e-17 


69.9 


1213 




nit nana 


3.2e-07 


37.4 


1219 




RNA recognition motif.


2.1e-40 


147.7 


1220 


DUF6 


Integral membrane protein DUF6 


0.015 


21.5 


1222 




SCAN domain 


1.5e-71 


251.1 


1223 


G- gamma 


GGL domain


3.6e-36 


129.5 


1227 


catalase 


Catalase 


0 


1158.9 






PX domain


2.2e-15


64.5


1233


PX


PX domain


2.2e-15


64.5 


1236 




Fes/CIP4 homology domain


3.3e-09 


44.0 


1241 


Peptidase_M2
0


Peptidase family M20/M25/M40 


2e-63 


224.1 


1243 


WW 


WW domain 


0.044 


17.9 


1247


upfoooS 


Metalloenzyme of unknown 
xunccion UfrUOUb 


6.3e-61 


215.8 


1248 


Glycos_trans
f_2


Glycosyl transferases 


4.5e-10 


46.9 


1249


efhand


EF hand


4e-11


50.4 


1254


UQ_con


Ubiquitin-conjugating enzyme


2.1e-73


257.3 


1255

i oqg 


ras 


Ras family 


2.2e-62 


220.7 




formyl_trans
f


Formyl transferase 


4.9e-30 


108.3 


1259 


zf-C3HC4 


Zinc finger, C3HC4 type (RING
finger)


5.3e-13


46.4 


1261 


DiHfolate_re
d


Dihydrofolate reductase


2.1e-69


241.7 


1262 


G_glu_transp
ept


Gamma-glutamyltranspeptidase


1.8e-110


380.4


1263 


PAS 


PAS domain 


1.3e-08 


36.9 


1265 


LRR 


Leucine Rich Repeat


4.2e-22 


86.9 



259 



WO 01/53312 



PCT/US00/34263





1266 


SCP 


SCP-like extracellular protein 


6e-29 


108.0 


1267 


K_tetra 


K+ channel tetramerisation 
domain 


2.8e-27 


104.0 


1269 


ras 


Ras family 


1.3e-85


297.9 


1275 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


4.2e-10


37.0 


1276


abhydrolase 


alpha/beta hydrolase fold 


5.4e-23 


89.8 


1277 


abhydrolase 


alpha/beta hydrolase fold


5.6e-21 


83.1 


1279 


trypsin 


Trypsin 


4.4e-41 


132.0 


1280 


PBP 


Phosphatidylethanolamine- 
binding protein 


1.3e-13 


58.7 


1285 


Zf -C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


5.6e-14 


49.6 


1287 


ank 


Ank repeat 


1.7e-52 


187.8


1294 


fn3


Fibronectin type III domain 


0.026 


20.9 


1295 


GBP


Guanylate-binding protein


0.00026 


-70.0 


1296 


PMP22_Claudi 
n 


PMP-22/EMP/MP20/claudin family


6.9e-41


149.3 


1297 


Rhodanese 


Rhodanese-like domain


3.2e-14


60.7


1298 


LIM 


LIM domain containing proteins 


5.8e-21 


79.1 


1301 


rnaseA 


Pancreatic ribonucleases 


4.9e-43 


145.2 


1307 


mito_carr


Mitochondrial carrier proteins


2.1e-53


186.0 


1308 


WD40 


WD domain, G-beta repeat 


1.6e-17 


71.6 


1310


UPAR_LY6


u-PAR/Ly-6 domain 


7.1e-20 


75.5 


1313 


thiored 


Thioredoxin 


3.6e-05 


21.6 


1314 


Aa_trans 


Transmembrane amino acid 
transporter protein 


1.5e-67


237.9 


1316 


trypsin 


Trypsin 


4.4e-41


132.0


1320 


Ribosomal_L1
3


Ribosomal protein L13 


3.9e-62 


219.8 


1327 


Armadillo_se
g


Armadillo/beta-catenin-like 
repeats 


0.0054 


23.4 


1328 


KRAB 


KRAB box 


0.052 


-5.6 


1329 


rrm 


RNA recognition motif. 


2.1e-40 


147.7


1330 


Bcl-2 


Apoptosis regulator proteins, 
Bcl-2 family 


0.014


-1.6


1331 


PX 


PX domain 


2.1e-10 


48.0 


1333 


KRAB 


KRAB box 


1.8e-36


134.6


1334 


UPP_syntheta
se


Putative undecaprenyl 
diphosphate synt 


2.3e-89


310.3 


1335 


UPP_syntheta 
se 


Putative undecaprenyl 
diphosphate synt 


1.8e-59 


211.0 


1336 


DSPc


Dual specificity phosphatase, 
catalytic doma 


1.2e-31 


118.6 


1337


DSPc 


Dual specificity phosphatase, 
catalytic doma 


2.3e-12


54.5


1338


TPR 


TPR Domain 


0.00021 


28.1 




metalthio


Metallothionein


0.013 


20.3 




mutT 


Bacterial mutT protein 


5.8e-09


36.5 




Band_41


FERM domain (Band 4.1 family)


1.3e-38


122.5 


1344 


Kelch 


Kelch motif 


1.4e-44 


161.5 




Antifreeze 


Antifreeze protein 


1.2e-10


48.8 


1347 


3Beta_HSD 


3-beta hydroxysteroid
dehydrogenase/isomera


0.086 


-177.2 




BTB 


BTB/POZ domain 


5.3e-28 


106.5




DUF6 


Integral membrane protein DUF6 


0.033 


15.8 


1350 


myosin_head 


Myosin head (motor domain) 


0 


1088.1 


1352 


Nramp 


Natural resistance-associated 


1.2e-202


686.6 


1353 


S_100


S-100/ICaBP type calcium 
binding domain 


5.3e-23 


89.9 


1355 


DEAD 


DEAD/DEAH box helicase 


3.6e-65 


209.0 


1356


C2 


C2 domain 


2.4e-15 


64.4 


1357


RBD


Raf-like Ras-binding domain


4.2e-57


203.1


1360 


zf-C2H2


Zinc finger, C2H2 type 


7.4e-141 


481.4 


1361 


HMG14_17


HMG14 and HMG17


7.9e-40


145.7 


1362


SIS


SIS domain 


3.8e-30 


113.6


1363




bib domain 


1.3e-28 


108.5 


1364 




Immunoglobulin domain 


0.00026 


19.0 


1366 




K+ channel tetramerisation 
domain 


1.1e-16


68.9 


1371 




Collagen triple helix repeat
(20 copies)


2.2e-113


390.1 


1372 


DnaJ 


DnaJ domain


6.6e-36 


132.7 


1376 


KRAB 


KRAB box 


2.1e-38


141.0 


1378 


ELM2


ELM2 domain


2e-23 


91.3 


1366 


thiored 


Thioredoxin 


1.2e-23


82.8 


1381 


ank 


Ank repeat 


2.3e-83 


290.4 


1382 


BTB


BTB/POZ domain


3e-11


50.8 


1383


WD40 


WD domain, G-beta repeat 


1.6e-19


78.3 


1384 


WD40


WD domain, G-beta repeat 


6.3e-24 


92.9 


1387 




Zinc finger, C3HC4 type (RING 
finger) 


1.1e-09


35.6 


1389 




Zinc finger, C2H2 type 


5.5e-50 


179.5 


1390 


zf-C2H2


Zinc finger, C2H2 type 


2.5e-85 


296.9 


1393 


kinesin


Kinesin motor domain


T.8e-188 


637.4 


1394 


zf-C2H2 


Zinc finger, C2H2 type 


1.2e-49 


178.4 


1^ 9ft 


KRAB 


KRAB box 


5.1e-22 


86.6 


1402 




bZIP transcription factor 


0.035 


13.1 


1405 


sugar_tr 


Sugar (and other) transporter 


0.003 


-101.5 


1406


RhoGAP 


RhoGAP domain 


8.9e-47 


168.8 


1407


rrm 


RNA recognition motif. 


1e-35


132.1


1408


LRR 


Leucine Rich Repeat 


2.1e-13 


58.0


1409


Nebulin_repe
at


Nebulin repeat


6e-54 


192.6 


1410 




Ank repeat 


1.6e-17


71.4


I'll* 


Ribosomal_L5


ribosomal L5P family C-terminus


8.2e-58 


205.5 


1415 


trypsin 


Trypsin 


4 .7e-85 


270.4


1416 


aminotran_1


Aminotransferases class-I


4.4e-05 


-91.2 


1417 


S1


S1 RNA binding domain


1.6e-07


33.1 


1419 


WD40 


WD domain, G-beta repeat 


2.2e-09 


44.6 




cadherin 


Cadherin domain 


8.3e-42 


152.3




SH3 


SH3 domain 


2.5e-80


280.3 




PHD 


PHD-finger


3.2e-17 


70.6 


14 


PHD 


PHD-finger


3.2e-17


70.6 


1427 


ArfGap 


Putative GTP-ase activating 
protein for Arf 


le-$i 


138.8


1428 


helicase_C


Helicases conserved C-terminal
domain


1e-26


102.2 


1429 




WD domain, G-beta repeat 


3.9e-07


37.2 


1430 




inositol monophosphatase family 


2.5e-10 


40.2 


1431 




Mitochondrial carrier proteins 


4.3e-83 


287.7 


1433 




Clq domain 


2.9e-16 


66.2 


1434 


WD40 


WD domain, G-beta repeat 


1.6e-13 


58.3 


1435 


Inos-1-
P_synth


Myo-inositol-1-phosphate
synthase 


7e-228 


770.4 


1436 




RNA recognition motif. 


1.4e-34


128.3 


1438 


ig


Immunoglobulin domain


1.3e-12


45.6 


1440 


Q Adanfc CT 


Gamma-adaptin, C-terminus


3.4e-67 


236.7 


1441 




Gamma-adaptin, C-terminus


3.4e-67


236.7 


1443 


Kelch 


Kelch motif


0.00013 


28.7 


1446 


ARID 


ARID DNA binding domain


1.8e-21 


84.7 


1447 


zf-C2H2 


Zinc finger, C2H2 type


9.4e-28 


105.6 


1448 


AMP-binding


AMP-binding enzyme


2.6e-07


-14^.1 


1451 


rrm 


RNA recognition motif. 


6.5e-21


82.9


1454


ig


Immunoglobulin domain 


5.6e-44 


146.7 


1455 


Sialyltransf


Sialyltransferase family


5.4e-21


83.2


1460 


Aldose_epim


Aldose 1-epimerase


1.9e-3$ 


131.2 


1461 


C2 


C2 domain


4e-18


73.6


1470 


TIG 


IPT/TIG domain


3.1e-19


77.3 


1472 


PseudoU_synt 


RNA pseudouridylate synthase 


4.3e-16 


66.9 




1474 
1475 


DENN 

Cation_efflu
x


DENN (AEX-3) domain 

Cation efflux family 


1.3e-44 
4.6e-49 


161.6 
176.4 




1477 


TBC 


TBC domain 


8e-47 


169.0




1478 
1480 


rrm 
ig 


RNA recognition motif. 


2e-21


84.6 




1484 
1485 


Telo_bind_al 
pha 

zf-C2H2


Immunoglobulin domain

Telomere-binding protein alpha
subunit

Zinc finger, C2H2 type 


5.5e-06
0.028 


24.3 
-225.9 




1486 


pkinase 


Eukaryotic protein kinase
domain


1.8e-68
9.5e-13 


240.9 
49.9 




1488


helicase_C


Helicases conserved C-terminal
domain 


1.4e-15 


65.2 




1489


DUF89 


Protein of unknown function 
DUF89 


0.079 


-132.4 




1490


ECH


Enoyl-CoA hydratase/isomerase


5.2e-41 


149.7 




1491 


c 


Adenylate and Guanylate cyclase
catalyt 


5.9e-46 


166.1 




1492 


LRR 




3.4e-19


77.2 




1495 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


7.1e-10


36.3 




1497 


pkinase 


Eukaryotic protein kinase 
domain 


1e-22


85.8 




1500 


SH3 


SH3 domain 


9.3e-05 


27.2


1502 
1503 


homeobox 
homeobox 


Homeobox domain 
Homeobox domain 


0.084 
0.084 


13.8
13.8 




1505 
1506 


EGF
UCH-2


EGF-like domain

Ubiquitin carboxyl-terminal
hydrolase family


2.7e-23
2.7e-21


90.8
84.2




1508 

1511 
1512 


Peptidase M2 
0 

PX 

Sulfatase 


PX domain 
Sulfatase 


2.8e-28
1.9e-11


101.8 
51.5 




1516 
1518 


Syntaxin 
aminotran_3


Aminotransferases class-III
pyridoxal-pho


2.8e-35
0.011
9.7e-106 


130.7 
-62.3 
305.6 




1520


ig 


Immunoglobulin domain 


0.075 


11.0 


1521 

1523 
1528 


RA 

RhoGAP 
WD40


Ras association (RalGDS/AF-6)
domain

RhoGAP domain

WD domain, G-beta repeat 


0.613 

2.5e-05
5.4e-24


13.3

10.7
93.1 


1535 
1538 
1539

1540


IMS 

FYVE 

DAGKc 

ocular_alb 


impB/mucB/samB family
FYVE zinc finger

Diacylglycerol kinase catalytic
domain


7.8e-95
3.2e-27
6e-07


328.5
101.5
36.5 


1653 


SAP


Ocular albinism type 1 protein 
SAP domain 


0 

6e-06 


1184.7 
33.2


1654 


Amino_oxidas 
e 


Flavin containing amine oxidase 


3.2e-43 


157.0 


1655 
1656 


Amino_oxidas 
e 

RhoGEF 


Flavin containing amine oxidase 
RhoGEF domain 


3.2e-43 
1.4e-24 


157.0 

95.1


1657 
"16^ 

1660 


MMR_HSR1
actin


GTPase of unknown function

Ubiquitin carboxyl-terminal

hydrolase family

Actin


0.0011
2.5e-11

6.6e-21 


-45.5 
51.1 

69.9 


1661 
1662 

1663 


BAH

vwa 
WD40 


BAH domain 

von Willebrand factor type A 
domain 

WD domain, G-beta repeat 


1.7e-82
0

1.4e-67


287.5
1909.4 

237.9 


1667
1669

1671 


zf-C2H2 

Nol1_Nop2_Su

n

SH2


Zinc finger, C2H2 type
NOL1/NOP2/sun family

Src homology domain 2


r.3e-93 
1.3e-23 

5.4e-15


324.4
84.3 

46.9 


1672 




Oroanization MQdifi*»Y*\ 


2.1e-18


67.7


1674 


zf-CCCH


type 


n nno c 


17.6


1676 


Glyco_hydro
47


Glycosyl hydrolase family 47


1 fl*»- 1 ft*? 

j> - oe x o / 


act ~ 
boo * 2 


1677 


Glyco_hydro 
47 


Glycosyl hydrolase family 47 


4.5e-74 


259.5 


1680 


WD40 


WD domain, G-beta repeat 


1.1e-27


105.5


1681 


WD40 


WD domain, G-beta repeat 


1.1e-27


105.5


1683 


MMR_HSR1 


GTPase of unknown function


1.8e-78 


274.1 


1691 


rrm 




" ~"i — a^TTv 


137.9


1692 


rrm 




± . oe- j / 


137.9


1693


AAA


ATPases associated with various
cellular act


1.3e-81


284.5


1697 


Ferric_reduc
t


Ferric reductase like
transmembrane com


8.4e-82


285.2


1698 


Ferric_reduc 
t 


Ferric reductase like 
transmembrane com 




190.1


1699 


zf-C2H2 






4.4e-34


126.6


1700


arf


ADP-ribosylation factor family




75.8


1702 


GTP_EFTU


Elongation factor Tu family


0.014


11.4


1703 


SCAN 


SCAN domain 


1.8e-54 


194.4


1707 


pkinase


Eukaryotic protein kinase
domain


1.2e-88


307.9


1709 


WD40


WD domain, G-beta repeat


0.0035


24.0


1710 


LRR 




1.2e-30


115.3


1711 


WW 


WW domain


7.6e-12


52.8


1712 


ank 




4.2e-34


126.7


1713 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H
type


2.6e-09 


38.3 


1714 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H


2.6e-09 


38.3 


His 


ras 


Ras family 


4.4e-41 


149.9 


1718 


HMG_box


HMG (high mobility group) box


8.3e-21


82.6


1719 


Ftbc 




1.1e-45


165.2 


1721 


HLH


Helix-loop-helix DNA-binding
domain


9.2e-10


45.9 


1723 


dsrm


Double-stranded RNA binding
motif 


2.9e-05 


30.9 


1724 


RrnaAD 


dimethylases


0.045 


9.2 


1725


CIDE-N


CIDE-N domain


5.9e-40


146.2


1725 


HAT 


HAT (Half-A-TPR) repeat


2.9e-44


160.5


1728 


efhand 


EF hand 




79.9


1733 


Hist_deacetyl


Histone deacetylase family




360.6 


1735 


LRR 


Leucine Rich Repeat 




4.6e-34


126.6


1739 


PI-PLC-X 


Phosphatidylinositol-specific
phospholipase




16.1


1743 


ras 


Ras family


3.7e-10 


-31 T. 


1744 


ras 


Ras family 


3.7e-10


-4 X . o 


1745 


RasGEF


RasGEF domain 


3.2e-49 


X / O . 7 


1746 


adh_short 


short chain dehydrogenase 


7.1e-08




1751


zf-C2H2 


zinc finger, C2H2 type 


9e-2& ~ 


14^ . ^ 


1754 


fn3 


Fibronectin type III domain 




J* O . 7 


175^ 


zf-C2H2


Zinc finger, C2H2 type 




■a n »> i 


1758 


rrm


RNA recognition motif. 






1760 


Nop 


Putative snoRNA binding domain


6.1e-95


328.8 


1761 


Nop 


Putative snoRNA binding domain 


6.1e-95 


328.8 


1765


MMR_HSR1


GTPase of unknown function 


6.4e-41 


149.4 


1769 


CN_hydrolase


Carbon-nitrogen hydrolase


3e-06 


-43.9 


1775 


ank 


Ank repeat 


4.1e-07 


37.1 


1779 


Oxysterol_BP


Oxysterol-binding protein


4.7e-56 


199.6 


1783


RhoGEF 


RhoGEF domain 


1.6e-23 


91.6 


1784 


RhoGEF 


RhoGEF domain 


1.6e-23 


91.6


1785 


rrm 


RNA recognition motif. 


6.4e-14 


59.7 



TABLE 5



SEQ ID NO:


POSITION OF
SIGNAL IN AMINO
ACID SEQUENCE


MaxS (MAXIMUM
SCORE)


MeanS (MEAN
SCORE)


1 


1-21 


0 .991 


0 . 955 


2 


1-31 


0 . 995 


0 . 944 


3 


1-33 


0 .949 


0.736


4


1-19 


0.970 


0 . 951 


5 


1-26 


0 .971 


0.863 


6 


1-26 


0 .971 


0 . 863 


7 


1-26 


0 . 971 


0.863 


8 


1-24 


0 . 971 


0.863


9 




0.982 


0 .901 


10 


1-21 


0 . 991 


0 . 955 


11 


1-23 


0.989 


0 . 699 


12 


1-25 


0.955 


0 . 803 


13


1-18 


0.932 


ft end 


14 


1-18 


0.938 


0 . 876 


15 


1-25 


0 . 941 


0.811


16 


1-17


0 . 972 


0.939 


17 


1-27 


0.964 


0 . 777 


18 


1-16 


0.914 


0.657 


19 


1-19 


0.953


0.840 


20 


1-20 


0.935


0.701


21


1-22


0 . 974 


ft QCft 


22


1-33


0.961


0.895


23


1-19 


0 . 991 


0 . 959 


24 


1-31 


0.995


0 . 944 


25 


1-22 


n of e 
u . a f d 


0 . 935 


26


1-27 


0 . 996 


0 . 928 


27 


1-24 




0 . 739 


28 


1-21 




0.688 


29 


1-31 


n oftg 

u . yoo 


0.841 


30 


1-28 


noon 


0 . 893 


31 


1-19 


u . yy j 


0 . 976 


32 


1-22 


A QQO 
V . J JO 


0.909


35 


1-33 


n QiiQ 


0.736 




1-33 




u . 7 J b 


46 


1-19 


0.970


0 . 951 


67 


1-25 


0 . 968 


0 . 848 


71 


1-18 


0 . 949 


U . OH 3 


72 


1-30 


0 .991 


~n — qTo " 

u . y iy 


75 


1-29 


0 . 958 




88 


1-20 


0 . 986 


U . 3 


94 


1-33 


0 . 994 


u . ys J 


97 


1-46 


0 . 964 


n cqc 


103 


1-49 


0.983 


Q C7fl 


108 


1-26 


0.978 


0 Q DC " 

V . ODD 


111 


1-23 


0 . 989 


ft ROQ 


126 


1-25 


0 . 955 


nam 


129 


1-19 


0 . 963 




138 


1-29 


0 . 971 


q t g44 


143 


1-18 


0 . 914 


0 . 628 


148 


1-20 


0.969 


0.904


156 


1-25 


0 . 941 


0.811


158


1-22


0.979


u • y« / 


160 


1-17 


0 . 972 


u . yo y 


161 


1-48 


0.903 


07571 


162 


1-25 


0.937 


0.729 


168 


1-16 


0.939 


0.826 


171 


1-27 


0.964 


0.777 


178 


1-21 


0.945 


0.825 


180 


1-27 


0.981 


0.941 


187 


1-28 


0.982 


0.936 


190 


1-19 


0 . 953 


0.840 


196 


1-22 


0.975 


0.916 


197 


1-22 


0.963 


0.936 


199 


1-20 


0.935


0 . 701 


200 


1-23 


0 . 977 


u . / 1 A 


206 


1-30 


0.984 


0 . 890 


207 


1-19 


0 . 990 


n o?a "' 


208 


1-22 


0 . 974 


n a en 


210 


1-40 


0 . 940 


0.670


211 


1-28 


0 . 971 


0.849


216


1-24 


0.986 


0 . 956 


218 


1-33 


0.961 


0.895


219 


1-19 


0.970 


u • 0 f k 


221 


1-19 


0.904 


n eel 


222 


1-21 


0 . 917 




230 


1-19 


0.991 


u . soy 


231 


1-26 


0 . 953 


0 an a 
u . ouv 


232 


1-25 


0.988 


n a *s <r 
U . BZo 


239 


1-23 


0.969 


0.628 


240 


1-17 


0.982 


0.955 


241 


1-17 


0.982


0 . 955 


245 


1-30 


0 . 970 


0 . 722 


246 


1-22 


v • y r o 


0 . 935 


249 


1-23 




0.940


252 


1-18 


u . y / X 


0 . 923 


261 


1-24 


n o q i 


0.587 


265 


1-18 


0.939 


0.868 


272 


1-24


0.953


0 . 739 


283


1-21


u . y uo 


0 . 688 


284


1-29 


A 0.0*7 
U • St Jr / 


0 . 854 


290 


1-31 


n q a *c 


0.841


302 


1-28


u . jo v 


0 . 893 


304 


1-16 


0 . 907 


0.635


312 


1-19 


U . SJtj 


0 . 976 


313 


1-17 


U • y j U 


0 . 753 


323 




0 . 998 


0.909 


324


1-17


U . JOZ 


0 . 954 


326 


1-19 


0.971 


0 . 865 


329 


1-22 




0 . 924 


330 


1-33


0 . 978 


0 . 841 


331 


1-24 


n 

U . 94V 


0 . 712 


332 


1-24 


U . 3 / D 


0 . 881 


333 


1-19


O CO/1 


0 . 941 


334 


1-20 




0.567 


335 


1-27 


0.942


0 . 813 


336 


1-20 


0 . 952 


0.850


337 


1-38 


0.942 




338 


1-27 


0.973 


0.772 


339 


1-36 




0 . 804 


340 


1-27 


0.888


n c a *7 


343 


1-19 


0.971 


U . ODD 


344 


1-22


0.994 


0 . 928 


345 


1-17 


0.966 


"n ■ 

U . 00/ 


346 


1-19 


0.936 




347 


1-22 


0.963


n an a 


349 


1-24 


0 . 982 


0 . 966 


351


1-21 




0 . 815 




1-31 


u . 9O0 


0 . 912 


354 


1-31 




0 . 839 


355 


1-29 


n q-jq 


0 . 632 


356


1-15 


0.994 


0 .969 


357 


1-33 


0.935 


0.726 


360 


1-27 


0.938 


0.821 


361


1-25 


0.954 


0.674 


362 


1-22 


0.929 


0 .788 


363


1-21


0.881


0.715


364


1-33


0.978 


0.841 


365 


1-33 


0.978 


0.841 


366 


1-21 


0.916 


noon 


367 


1-19 


0 . 936 




368 


1-29 


0 . 972 


n a ia 

\J . O ( H. 


370 


1-24 


0 . 920 


U . / 1^ 


371 


1-24 


0.961


0.773


372 


1-27 


0 . 919 


0.768 


373 


1-19 


U . _/ O U 


0.945


375


1-32 


n qqj 
u . yy*± 


0.932 


376 


1-34 


0 . 987 


0.810 


377 


1-17 


0 . 995 


0.950


378 


1-49 


0 . 971 


0.749


380 


1-20


0.968


0.874


381 


1-20 


0.928


0.782 


382 


1-19 


0.986 


0.934


383 


1-28




0.829 


384 


1-39 


0 . 970 


0 .551 


386 


1-24 


n Q7C 

U . 3 /o 


0.881 


388


1-30




0.868


389 


^ _ -± q 


0.984 


0 . 941 


390 


1-26 


0 . 971 


0 .782 


392 


1-20


0.981 


0 .900 


393 


1 _ - c — 

x - „o 


0 . 968 


0 . 890 


394 


X- « J 


0 .937 


0 .701 


397 




0 .985 


0 . 854 


399 


X- 


0 .977 


0.698 


401 


I- «U 


0 . 899 


0 .567 


402 




0 . 967 


0 . 931 


403 


1-27 


0 . 992 


0 .934 


404 


x- x;* 


0 . 991 


0 .973 


405 




0 . 994 


0 .921 


407 


1-35 


0 . 987 


0 .658 


408 




0 . 976 


0 . 551 


409




0 .897 


0 .570 


410 


X - 


0 . 990 


0 .962 


411 


X - ■» O 


0.977


0 . 827 


412 


1 ,on 


0 . 944 


0 . 768 


413 


1-20 


0.988 


0 . 965 


414 




0 . 993 


0 .638 


415


X "£J 


0 . 981 


0 . 940 


417


— 


0 . 941 


0 . 672 


418 


1-20 


0 . 952 


0.850 


419 




0 . 986 


0 . 967 


420 


1-29


n ore 


0.861 


421 


1-22


n Q d a 


0.785


422 


1-48


U . 3 OA 


0 . 862 


424 


1-19 


0.979


0 . 933 


428 


1-38


0.942


0.653


430 


1-18 




u . by b 


432 


1-33 




0 . 789 


433 


1-26 




0 . 904 


434 


1-27 


0 . 962 


0 . 777 


435 


1-24 


0.998


n on? 
U . 5r 11 


436 


1-27 


\J .13 I 5 


0 . 772 


443 


i-isT" 


U . Sob 


0 . 940 


448 


1-36 


0 . 979 


0 . 804 


453 


1-41 




0 . 609 


455 


1-33 


0.943 


0 . 606 


457 


1-27 


0.888


0 . 597 


462 


1-16 


0.925 


0.681 


486 


1-27 


0.972 


0.845 


495 


1-24 


0.917 


0.636 


498 


1-26 


0.993 


0.890 


505 


1-20 


0.976 


0.926 


507 


1-17 


0.966 


0.687 


510 


1-23 


0.930 


0.593


511


1-23 


0.930 


0.593


512


1-23


0.930 


0.593


515


1-18 


0.978 


0 .956 


523


1-19


0.936 


0.822 


529 


1-22 


0.963 


0 . 924 


545 


1-24 


0.982 


0.966 


550


1-30


0.933 


0 .713 


552 


1-21 


0.973 


0 .912 


554 


1-23 


0.969 


0 .784 


571 


1-21


0.918 


0 . 815 


574 


1-31 


0.988 


0 . 912 


580


1-39


0.925 


0 .556 


594 


1-31 


0.974 


0.839 


608 


1-29 


0.932 


0.632 


609 


1-29 


0.932 


0.632 


610 


1-21 


0.990 


0 . 948 


621 


1-15 


0 .994 


0 . 969 


623 


1-33 




0.726 


653 


1-27 


0.936 


0 . 827 


668 


1-22 


0.929


0.788


677


1-16 


0 .948 


0.807 


685 


1-21 


0.881 


0.715


699


1-22


0.975


0 . 816 


702 


1-31 


0.968 


0 . 898 


707 


1-16


0.860


0.562


713 


1-25 


0 .966 


0 . 743 


718 


1-19 


0.936 


0.822


719 


1-20 


0 .961 


0.824 


729 


1-29 


0.972 


0 . 874 


735 


1-46 


0.903


0.598


746 


1-14


0.916


0.730


747


1-22 


0.965 


0.876 


748 


1-29 


0 .968 


0 . 785 


759 


1-24 


0.961 


0.773 


767 


1-27 


0 .919 


0.768


768 


1-33 


0 .900 


0.585


773 


1-42 


0.959 


0.702


779 


1-19 


0 .986 




797 


1-19 


0 . 944 


0.153 


798 


1-19 


0.900


0T5T8 


820 


1-17 


0.995 


0 . 950 


827 


1-49 


0.971 


0.749


848 


1-20 


0.968 


0.874 


864 


1-20


0.928


0 . 782 


866 


1-19 


0.986 


0.934


873 


1-23 


0 . 948 


0.886


881


1-28 


0.965 


0.829 


887 


1-39 


0.970


0.551


927 


1-30 


0.989 


0 .668 


934 


1-48 


0.988 


0 . 777 


939 


1-39 


0 . 994 


0 . 889 


944 


1-26 


0.971


0 . 782 


950 


1-29 


0.957 


0 . 845 


963


1-20 


0.981 


0 . 900 


964 


1-20 


0 .686 


0.558 


973 


1-16 


0 . 968 


0 . 890 


980 


1-34


0 . 961 


0.749 


981 


1-20 


0 . 953 


0.822


984 


1-12 


0.938 


0.780 


1015 


1-22 


0 . 985 


0.854 


1040 


1-46


0.977 


0.698 


1052


1-18 


0.969 


0.842 


1059 


1-20 


0.927 


0.867 


1065 


1-33 


0.983 


0.918 


1069 


1-22 


0.993 


0.935 


1075 


1-27 


0.992 


0 .934 


1080


1-19


0.931 


0.829 


1092


1-19


0.991 


0.973 


1094


1-46


0.992 


0 .653 


1095 


1-30 


0.974 


0.929


1105 


1-23 


0.994 


0 .921 


1123 


1-35 


0.987 


0.658 


1138 


1-32 


0.954 


0.613 


1140


1-33 


0.989 


0.789 


1142 


1-33 


0.897 


0.570 


1152


1-25


0.990 


0.962 


1170 


1-38 


0.977 


0.827 


1176 


1-20 


0.944 


0.768 


1187 


1-20 


0.988 


0 . 965 


1189 


1-35 


0.967 


0 . 839 


1192 


1-46 


0.993 


0 .638 


1193 


1-16 


0.925 


0 . 710 


1197 


1-29 


0.985 


0 . 853 


1208 


1-23 


0.981 


0.940 


1225 


1-29 


0.941 


0.672 


1245 


1-19 


0.986 


0.967 


1256 


1-29 


0.965 


0.861 


1265 


1-22 


0.8*9 " 


0.785 


1266 


1-20 


0.944 


0 . 809 


1276 


1-48 


0 .982 


0 . 862 


1292 


1-19 


0 .979 


0 . 933 


1296 


1-21 


0 .984 


0.944


1297


1-19


0.984 


0 . 953 


1332 


1-38 


0 .942 




1358 


1-18 


0.947 




1371 


1-33 


0.957 


0.789 


1380 


1-26


0.979 


0.904 


1397 


1-27 


0.962


0.777


1399 


1-23 


0.997 


0.960 


1404 


1-24 


0.998 


0,377 


1410 


1-15 


0.94£ 


0.845


1414 


1-24 


0.913 


0.588 


1415 


1-19 


0.982 


0 . 929 


1416 


1-12 


0.931 


0.891 


1418 


1-30 


0.933 


6.5£j 


1420 


1-20 


0.881 


0.561 


1421 


1-19 


0.990 


0.968 


1423 


1-17 


0.968


0 . 9$3 


1424 


1-21 


0.885


0.591


1425 


1-24 


0.913 


0 . 588 


1426 


1-24


0.913 


0.508 


1428 


1-25 


0.967 


0.899 


1430 


1-34 


0.977


0 . 819 


1431 


1-28 


0.979


0 . 923 


1432 


1-36 


0.957 


0 . 613 


1433 


1-32 


0.921 


0.753 


1434


1-39 


0.983


0 . 621 


1435 


1-25 


0.910 


0.631 


1436 


T^T- 


0.988 


0 . 868 


1437 


1-22 


0 .998 


0.980 


1442 


1-20 


0.918 


0.753


1448 


1-12 


0.931 


0.891


1462 


1-18 


0.968 


0.888 




1-20 


0.881 


0.561 


1518 


1-17 


0.969 


0 .863 


1525 


1-21 


0.885 


0.591 


1547 


1-28 


0.974 


0 .891 


1561 


1-25


0.967 


0 .899 


1580


1-17


0.923


0.824


1593


1-28


0.979


0.923


1596 


1-16 


0.929 




1601 


1-36 


0 . 957 


0.613


160«* 


1-22 


0.^79 


0.831


1607


1-20 


0 . 974 


0.770 


1608 


1-32 


0.921 


0.753 


1614 


1-33 


0.969


0.829


1616


1-20 


0.959 


0.869 


1625 


1-39 


0.983 


0.621 


1632 


1-25 


0.910 


0.631


1636 


1-33 


0.897 


0.591 


1639 


1-42 


0.968 


0.868 


1645 


1-20 


0.927 


0.568


1647 


1-17 


0 .923 


0.742 


1648 


1-22 


0.998 


0.980 



TABLE 6



SEQ ID NO: 
of full- 
length 
nucleotide 
sequence 


SEQ ID 
NO: of 
full- 
length 
peptide 
sequence 


SEQ ID NO: 
of contig 
nucleotide 
sequence 


SEQ ID
NO:
of contig
peptide
sequence


Priority
docket number_
corresponding
SEQ ID NO: in
priority
application


SEQ ID
NO: in
U.S. S.N.
09/488,725


1 


1787 


3573 


5359 


784CIP2_1


1103 


2 


1788 


3574 


5360 


784CIP2_2 


2673 


3 


1789 


3575


5361


784CIP2_3


4117 


4 


1790 


3576 


5362 


784CIP2 4 


5556 


5 


1791 


3577 


5363 


784CIP2_5 


5562 


6 


1792 


3578 


5364 


784CIP2 6 


5562 


7 


1793 


3579 


5365 


784CIP2 7 


5562 


8 


1794 


3580 


5366 


784CIP2 8 


5562


9 


1795 


3581 


5367 


784CIP2 9 


5563 


10 


1796 


3582 


5368 


784CIP2_10


5564 


11 


1797 


3583 


5369 


784CIP2_11


55*5 


12 


1798 


3584 


5370 


784CIP2 12 


5689 


13 


1799 


3585 


5371 


784CIP2_13


5729 


14 


1800 


3586 


5372 


784CIP2_14


5745 


15 


1801 


3587 


5373 


784CIP2_15


5777 


16


1802 


3588 


5374 


784CIP2_16 


5777 


17 


1803 


3589 


5375 


784CIP2_17


5789 


18 


1804 


3590 


5376


784CIP2_18


5792 


19 


1805 


3591 


5377 


784CIP2_19 


5804 


20 


1806 


3592


5378


784CIP2_20 


5805 


21 


1807 


3593 


5379 


784CIP2 21 


5805 


22 


1808 


3594 


5380 


784CIP2_22 


5844 


23 


1809 


3595 


5381 


784CIP2 23 


5844 


24 


1810 


3596


5382 


784CIP2 24 


5850 


25 


1811 


3597 


5383 


784CIP2 25 


5867 


26 


1812 


3598 


5384 


784CIP2 26 


5973 


27 


1813 


3599 


5385 


784CIP2 27 


5995 


28 


1814 


3600 


5386 


784CIP2_28 


5995 


29 


1815 


3601


5387 


784CIP2 29 


6005 


30 


1816 


3602 


5388 


784CIP2_30 


6007 


31 


1817


3603 


5389 


784CIP2_31


6007 


32 


1818 


3604 


5390 


784CIP2 32 


6009 


33 


1819 


3605 


5391


784CIP2_33


6012


34 


1820


3606 


5392 


784CIP2_34


6015 


35 


1821 


3607 


5393 


784CIP2 35 


6016 


36 


1822 


3608 


5394 


784CIP2_36 


6016 


37 


1823 


3609 


5395 


784CIP2_37


6018


38 


1824 


3610 


5396 


784CIP2_38


6018 


39 


1825


3611 


5397 


784CIP2_39 


6018 


40 


1826 


3612 


5398 


784CIP2_40


6023


41


1827


3613 


5399 


784CIP2 41 


6070


42 


1828 


3614 


5400 


784CIP2 42 


6081 


43 


1829 


3615 


5401 


784CIP2 43 


6089 


44 


1830 


3616 


5402 


784CIP2 44 


6118 


45 


1831 


3617 


5403 


784CIP2 45 


6118 


46 


1832 


3618 


5404 


784CIP2_46


6130 


47 


1833 


3619 


5405 


784CIP2 47 


6177 


48 


1834 


3620 


5406 


784CIP2 48 


6189 


49 


1835 


3621 


5407 


784CIP2 49 


6191 


50 


1836 


3622 


5408 


784CIP2_50


6204


51 


1837 


3623 


5409 


784CIP2_51


6204 


52 


1838 


3624 


5410 


784CIP2 52 


6284 


53 


1839 


3625 


5411 


784CIP2_53 


6367 


54 


1840 


3626 


5412 


784CIP2 54 


643^ 


55 


1841 


3627 


5413 


784CIP2 55 


6442 


56 


1842 


3628 


5414


784CIP2_56 


6445 


57 


1843 


3629 


5415 


784CIP2_57 


6457 


58 


1844 


3630 


5416 


784CIP2 58 


6458


59


1845


3631


5417


784CIP2_59


6458


60 


1846 


3632 


5418 


784CIP2 60 


6462 


61 


1847 


3633 


5419 


784CIP2 61 


6472 


62 


1848 


3634 


5420 


784CIP2_62


6499 


63 


1849


3635


5421 


784CIP2 63 


6499 


64 


1850 


3636


5422 


784CIP2 64 


6505 


65 


1851 


3637 


5423 


784CIP2 65 


6534 


66 


1852 


3638 


5424 


784CIP2 66 


6534 


67 


1853


3639


5425


784CIP2_67


6540


68


1854 


3640 


5426 


784CIP2 68 


6550


69 


1855


3641


5427


784CIP2_69


6550 


70 


1856 


3642 


5428 


784CIP2_70


6592 


71 


1857 


3643 


5429 


784CIP2 71 


6645 


72 


1858


3644


5430


784CIP2_72


Do / I 


73 


1859


3645 


5431 


784CIP2 73 


D / D J 


74 


1860 


3646 


5432 


784CIP2 74 


O / DJ 


75 


1861


3647


5433 


784CIP2 75 


6786 


76 


1862 


3648


5434


784CIP2_76


£ R OA 


77 


1863 


3649 


5435 


784CIP2_77


6830 


78 


1864 


3650 


5436 


784CIP2_78




79 


1865 


3651 


5437 


784CIP2_79


6834


80


1866 


3652 


5438 


784CIP2_80


6834 


81


1867 


3653 


5439 


784CIP2_81


6834 


82 


1868


3654 


5440


784CIP2_82


6835


83 


1869


3655 


5441 


784CIP2_83


6837


84


1870


3656


5442


784CIP2_84


6843 


85 


1871 


3657 


5443


784CIP2_85


6859


86 


1872 


3658 


5444 


784CIP2_86


6915 


87 


1873 


3659 


5445 


784CIP2_87


6932 


88 


1874 


3660 


5446 


784CIP2_88


6957 


89 


1875 


3661 


5447 


784CIP2_89


gcTZi 


90 


1876 


3662 


5448 


784CIP2_90


6973


91 


1877 


3663 


5449


784CIP2_91


6973 


92 


1878 


3664


5450 




/UU7 


93 


1879


3665 


5451 


784CIP2_94


7018 


94


1880 


3666 


5452 


784CIP2_95


7019 


95 


1881 


3667 


5453


784CIP2_96


7020


96 


1882 


3668


5454 


784CIP2_97


/U4U 


97 


1883 


3669 


5455 


784CIP2_98


7021 


98 


1884 


3670 


5456 


784CIP2_99


7fV51 " 


99 


1885 


3671 


5457 


784CIP2_100


/U4 / 


100 


1886 


3672 


5458 


784CIP2 101 


/ o 


101 


1887 


3673 


5459 


784CIP2 102 


7029 


102 


1888 


3674 


5460 


784CIP2 103 


/UJI 


103 


1889 


3675 


5461 


784CIP2 104 




104 


1890 


3676 


5462 


784CIP2_105


7nvi 


105 


1891 


3677 


5463 


784CIP2_106


7035 


106 


1892


3678


5464 


784CIP2 107 


701G 


107 


1893 


3679 


5465 


784CIP2 108 


7039 


108 


1894 


3680 


5466 


784CIP2 109 


7043 


109


1895 


3681 


5467


784CIP2 110 


70d 4 


110 


1896


3682


5468 


784CIP2 111 




111 


1897 


3683 


5469 


784CIP2_112




112 


1898 


3684 


5470 


784CIP2_113


/vol 


113 


1899 


3685 


5471


784CIP2_114


77FV5 

/v / / 


114


1900


3686


5472


784CIP2_115


7092 


115 


1901 


3687 


5473 


784CIP2 116 


7094 


116 


1902 


3688 


5474 


784CIP2_117


7106 


117 


1903 


3689 


5475 


784CIP2_118


7107 


118 


1904 


3690 


5476 


784CIP2_119 


7111 


119 


1905 


3691 


5477 


784CIP2_120 


7123 


120 


1906 


3692 


5478 


784CIP2_121 


7142 


121 


1907 


3693 


5479 


784CIP2 122 


7142 


122 


1908 


3694 


5480 


784CIP2 123 


/ IDS 


123 


1909 


3695 


5481 


784CIP2 124 


7 1 <n 

/ IOU 


124 


1910 


3696 


5482 


784CIP2_125


t lb? 


125 


1911


3697 


5483 


784CIP2 126 


/ 1(93 


126 


1912 


3698


5484 


784CIP2 127 


71 QT 

/ 1 y / 


127 


1913 


3699 


5485 


784CIP2_128


77 1 a — 1 


128 


1914 


3700 


5486 


784CIP2 129 


77 TC 


129


1915 


3701 


5487 


784CIP2 130 


7 7 7 a 


130 


1916 


3702 


5488 


784CIP2_131


7234 


131 


1917 


3703 


5489 


784CIP2_132


77 1 C 


132 


1918 


3704 


5490 


784CIP2 133 




133 


1919 


3705 


5491 


784CIP2_134


723 8 


134 


1920


3706 


5492


784CIP2_135


/*4 / 


135 


1921 


3707 


5493 


784CIP2_136


/Zol 


136 


1922 


3708 


5494 


784CIP2_137


7262 


137 


1923 


3709 


5495 


784CIP2_138


726*7 


138 


1924 


3710


5496


784CIP2_139


7272 


139 


1925 


3711 


5497


784CIP2_140


7273 


140 


1926 


3712 


5498


784CIP2_141


7282 


141 


1927 


3713 


5499 


784CIP2_142


7288 


142 


1928


3714


5500


784CIP2_143


7291 


143 


1929 


3715 


5501


784CIP2_144


7293 


144 


1930 


3716 


5502 


784CIP2_145


7294


145 


1931 


3717


5503


784CIP2_146


7299 


146


1932 


3718 


5504 


784CIP2_147


7300 


147 


1933 


3719 


5505 


784CIP2_148


7312 


148 


1934 


3720 


5506 


784CIP2_149


7313 


149 


1935 


3721 


5507 


784CIP2_150


7315 


150


1936


3722


5508


784CIP2_151


7318 


151 


1937 


3723 


5509 


784CIP2_152


7321 


152 


1938


3724 


5510 


784CIP2_153


7330 


153 


1939 


3725 


5511


784CIP2_154


7331 


154 


1940 


3726 


5512 


784CIP2_155


7333 


155 


1941 


3727 


5513


784CIP2_156


7350 


156 


1942 


3728 


5514 


784CIP2_157


7352 


157 


1943 


3729 


5515 


784CIP2_158


7384 


158 


1944 


3730 


5516 


784CIP2_159


7403 


159 


1945 


3731 


5517 


784CIP2_160


7431 


160 


1946 


3732 


5518 


784CIP2_161


7441 


161 


1947 


3733 


5519


784CIP2_162


7453 


162 


1948 


3734 


5520 


784CIP2_163




163 


1949 


3735 


5521 


784CIP2_164


7471 


164 


1950 


3736 


5522


784CIP2_165


7493 


165 


1951 


3737 


5523 


784CIP2_166


7502 


166 


1952 


3738


5524


784CIP2_167


TCI 1 

/Oil 


167 


1953


3739 


5525


784CIP2_168


7C T A ■ ' 

/5JL4 


168 


1954 


3740 


5526 


784CIP2_169




169 


1955 


3741


5527 


784CIP2_170


/541 


170 


1956 


3742 


5528


784CIP2_171




171 


1957 


3743 


5529 


784CIP2_172


7578 


172 


1958 


3744 


5530 


784CIP2_173


7583


173 


1959 


3745 


5531 


784CIP2_174


7592 


174 


1960 


3746 


5532 


784CIP2_175


7601 


175 


1961 


3747 


5533


784CIP2_176


7602 


176 


1962 


3748 


5534 


784CIP2 177 


7608 


177 


1963 


3749 


5535 


784CIP2 178 


7615 


178 


1964 


3750 


5536 


784CIP2 179 


7617 


179 


1965 


3751


5537 


784CIP2 181 


7624 


180 


1966


3752


5538


784CIP2_182


7626


181


1967 


3753 


5539 


784CIP2 183 


7640 


182


1968 


3754 


5540 


784CIP2 184 


7641 


183 


1969


3755 


5541 


784CIP2 185 


7641 


184 


1970 


3756 


5542


784CIP2 186 


/ OH Jl 


185 


1971 


3757 


5543 


784CIP2_187


1 D44 


186 


1972 


3758 


5544


784CIP2_188


Id A Q "™" 


187 


1973 


3759 


5545


784CIP2_189


7CCC 
/ ODD 


188 


1974 


3760 


5546 


784CIP2_190


7657 


189 


1975 


3761 


5547


784CIP2_191


7657 


190 


1976 


3762 


5548


784CIP2_192


/ bt>4 


191 


1977


3763


5549 


784CIP2 193 


/DO O 


192


1978 


3764 


5550 


784CIP2 194 


7673 


193 


1979 


3765 


5551 


784CIP2_195


"™7'2Tft o 


194 


1980 


3766 


5552 


784CIP2_196


T7nn 
/ / uy 


195 


1981 


3767 


5553 


784CIP2_197


*7 *7 A Q 


196 


1982 


3768 


5554


784CIP2_198


7736 


197 


1983 


3769


5555


784CIP2_199


7737 


198 


1984 


3770 


5556 


784CIP2_200


7744 


199 


1985 


3771


5557


784CIP2_201


7771 


200 


1986


3772 


5558 


784CIP2_202


7786 


201 


1987 


3773 


5559 


784CIP2_203


7791 


202 


1988 


3774 


5560


784CIP2_204


7797 


203 


1989 


3775 


5561 


784CIP2_205


7806 


204 


1990 


3776 


5562


784CIP2_206


7812


205 


1991 


3777


5563


784CIP2_207


7812 


206 


1992 


3778 


5564


784CIP2_208


7818 


207 


1993 


3779


5565


784CIP2_209


7822 


208 


1994 


3780 


5566 


784CIP2_210


7827 


209 


1995 


3781 


5567 


784CIP2_211


7830 


210 


1996


3782


5568


784CIP2_212


7835 


211 


1997 


3783


5569


784CIP2_214


7840 


212 


1998


3784


5570


784CIP2_215


7858 


213


1999


3785


5571


784CIP2_216


7858 


214 


2000 


3786 


5572


784CIP2_217


7861 


215 


2001 


3787 


5573


784CIP2_218


7866 


216 


2002 


3788 


5574 


784CIP2_219


7868 


217 


2003 


3789 


5575 


784CIP2_220


7896 


218 


2004 


3790 


5576 


784CIP2_221


7898 


219 


2005 


3791 


5577


784CIP2_222


7900 


220 


2006 


3792 


5578


784CIP2_223


7906 


221 


2007 


3793


5579


784CIP2_224


7908


222 


2008 


3794 


5580 


784CIP2_225


7909 


223 


2009 


3795 


5581 


784CIP2_226


7917 


224 


2010 


3796 


5582 


784CIP2_227


7932 


225 


2011 


3797 


5583


784CIP2_228


7940


226 


2012 


3798 


5584 


784CIP2_229


7940 


227 


2013 


3799 


5585 


784CIP2_230


7984 


228 


2014 


3800 


5586


784CIP2_231


*7Q Q/l 

/ y o4 


229 


2015 


3801 


5587


784CIP2_232


8001 


230 


2016


3802 


5588


784CIP2_233


Q AOl 
OU4X 


231 


2017 


3803 


5589 


784CIP2_234


8029 


232 


2018


3804 


5590 


784CIP2_235


8033 


233 


2019 


3805 


5591 


784CIP2_236


8040


234 


2020


3806


5592


784CIP2_237


8052 


235 


2021 


3807 


5593 


784CIP2_238


8096 


236 


2022 


3808 


5594 


784CIP2_239


8096 


237 


2023 


3809 


5595 


784CIP2_240


8113 


238 


2024 


3810 


5596


784CIP2 241 


8126 


239 


2025 


3811 


5597 


784CIP2 242 


8132 


240 


2026


3812 


5598 


784CIP2 243 


8137 


241 


2027 


3813 


5599 


784CIP2 244 


8137 


242 


2028 


3814 


5600 


784CIP2_245


8159 


243 


2029 


3815 


5601


784CIP2 246 


8159 


244 


2030 


3816 


5602 


784CIP2_247 


8161 


245 


2031 


3817 


5603 


784CIP2 248 


8176 


246 


2032


3818 


5604 


784CIP2_249 


8196 


247 


2033 


3819 


5605 


784CIP2_250 


8200 


248 


2034


3820


5606


784CIP2_251


8212


249 


2035 


3821 


5607


784CIP2_252


8220 


250 


2036 


3822 


5608


784CIP2_253


8238 


251 


2037 


3823 


5609 


784CIP2_254


8254 


252 


2038 


3824


5610


784CIP2_255


8255 


253


2039 


3825 


5611 


784CIP2_256


8288 


254 


2040 


3826 


5612 


784CIP2_257


8296 


255 


2041 


3827 


5613 


784CIP2_258


8329 


256 


2042 


3828 


5614 


784CIP2_259


8362 


257 


2043 


3829 


5615 


784CIP2_260


8429 


258 


2044 


3830 


5616 


784CIP2 261 


8436 


259


2045 


3831


5617


784CIP2 262 


8448 


260 


2046 


3832 


5618 


784CIP2 263 


8472 


261 


2047 


3833 


5619


784CIP2 264 


8502 


262 


2048 


3834 


5620


784CIP2_265 


8504 


263 


2049 


3835 


5621


784CIP2 266 


8507 


264 


2050 


3836


5622


784CIP2_268


8509 


265 


2051


3837


5623


784CIP2 269 


8515 


266 


2052 


3838


5624


784CIP2 270 


8519 


267 


2053


3839


5625


784CIP2_271 


8530 


268 


2054


3840


5626


784CIP2 272 


8532 


269 


2055


3841


5627


784CIP2 273 


8532 


270 


2056 


3842


5628


784CIP2 274 


8539 


271 


2057 


3843


5629


784CIP2_275


8541


272 


2058 


3844


5630


784CIP2_276


8543


273 


2059 


3845 


5631


784CIP2 277 


8593 


274 


2060 


3846


5632


784CIP2 278 


8595 


275 


2061 


3847


5633


784CIP2 279 


8615 


276 


2062 


3848


5634


784CIP2 280 


8620 


277 


2063 


3849 


5635 


784CIP2 281 


8621 


278 


2064


3850


5636


784CIP2 282 


8623 


279 


2065 


3851 


5637


784CIP2_283 


8625 


280 


2066 


3852


5638


784CIP2 284 


8628 


281 


2067 


3853


5639


784CIP2_285


8628 


282 


2068


3854 


5640 


784CIP2_286


8629 


283 


2069 


3855


5641


784CIP2_287


8630 


284 


2070 


3856 


5642


784CIP2_288


8631 


285 


2071 


3857 


5643


784CIP2_289


8633 


286 


2072 


3858 


5644 


784CIP2_290


8634


287 


2073 


3859 


5645


784CIP2_291


8635


288 


2074 


3860 


5646 


784CIP2_292


8636 


289 


2075 


3861 


5647 


784CIP2_293


8659 


290 


2076 


3862 


5648 


784CIP2_294


8660 


291 


2077 


3863 


5649 


784CIP2 295 


8667


292 


2078 


3864 


5650 


784CIP2_296 


8667 


293 


2079 


3865 


5651 


784CIP2 297 


8685


294 


2080


3866 


5652 


784CIP2 298


8805


295 


2081 


3867 


5653 


784CIP2_299


8896 


296 


2082 


3868 


5654 


784CIP2 300 


8978 


297 


2083 


3869 


5655


784CIP2_301 


9046


298 


2084 


3870 


5656 


784CIP2 302 


9048 


299 


2085


3871 


5657


784CIP2 303 


9116 


300


2086 


3872 


5658 


784CIP2 304 


9195 


301 


2087 


3873 


5659 


784CIP2_305 


9201 


302 


2088 


3874 


5660


784CIP2_306 


9307 


303 


2089 


3875 


5661 


784CIP2_307 


9321 


304 


2090 


3876 


5662 


784CIP2_308


9397 


305


2091 


3877 


5663


784CIP2 309 


9405 


306 


2092 


3878 


5664 


784CIP2 310 


9406 


307 


2093 


3879 


5665


784CIP2 311 


9422 
SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Priority docket number_corresponding SEQ ID NO: in priority application


SEQ ID NO: in U.S.S.N. 09/488,725


308


2094


3880


5666 


784CIP2_312 


9494 


309


2095


3881


5667


784CIP2 313 


9512 


310


2096 


3882 


5668


784CIP2 314 


9632 


311 


2097 


3883 


5669


784CIP2_315 


9661 


312 


2098 


3884 


5670


784CIP2_316 


9664 


313 


2099 


3885 


5671


784CIP2 317 


9691 


314 


2100 


3886 


5672


784CIP2_318


9700


315 


2101 


3887 


5673


784CIP2 319 


9716 


316 


2102 


3888


5674


784CIP2 320 


9721 


317 


2103


3889


5675 


784CIP2_321 


9870 


318 


2104


3890


5676


784CIP2_322 


9887 


319


2105


3891


5677 


784CIP2 323 


9923 


320 


2106


3892


5678 


784CIP2_324 


9938 


321 


2107


3893 


5679 


784CIP2_325 


9964 


322 

2108




3894


5680 


784CIP2_326 


10007


323


2109 


3895


5681 


784CIP2_327 


10009


324


2110


3896 


5682 


784CIP2_328 


10046


325


2111 


3897 


5683


784CIP2_329


10156 


326


2112 


3898 


5684 


784CIP2_330 


10276 


327


2113 


3899 


5685 


784CIP2_331


10283


328


2114 


3900 


5686 


784CIP2B 1 


152


329


2115 


3901 


5687 


784CIP2B 2 


167 


330


2116 


3902 


5688 


784CIP2B_3


205 


331 


2117


3903


5689 


784CIP2B 4 


210 


332 


2118


3904 


5690 


784CIP2B_5 


225 


333


2119 


3905 


5691 


784CIP2B_6 


226


334


2120 


3906 


5692 


784CIP2B 7 


264


335


2121 


3907 


5693 


784CIP2B 8 


268


336


2122 


3908 


5694 


784CIP2B 9 


293


337


2123 


3909 


5695 


784CIP2B 10 


293 


338


2124 


3910


5696 


784CIP2B 11 


293 


339


2125 


3911 


5697 


784CIP2B_12 


302


340


2126 


3912


5698 


784CIP2B 13 


311 


341 


2127 


3913 


5699 


784CIP2B 14 


352 


342 


2128 


3914 


5700 


784CIP2B_15 


358 


343


2129


3915 


5701 


784CIP2B_16 


368 


344 


2130


3916


5702 


784CIP2B 17 


393


345


2131


3917


5703 


784CIP2B 18 


477 


346 


2132 


3918 


5704 


784CIP2B 19 


508 


347 


2133 


3919


5705 


784CIP2B 20 


508 


348 


2134 


3920 


5706


784CIP2B 21 


515 


349


2135 


3921


5707 


784CIP2B 22 


578 


350 


2136


3922


5708


784CIP2B 23 


568 


351 


2137 


3923


5709 


784CIP2B 24 


591 


352 


2138


3924


5710


784CIP2B_25


593 


353 


2139


3925


5711 


784CIP2B 26 


594 


354 


2140


3926


5712


784CIP2B_27


619


355 


2141 


3927 


5713 


784CIP2B_28 


620 


356 


2142 


3928


5714 


784CIP2B 29 


654


357 


2143


3929


5715 


784CIP2B 30 


692 


358 


2144 


3930


5716 


784CIP2B 31 


753 


359 


2145 


3931 


5717 


784CIP2B 32 


758 


360 


2146


3932


5718 


784CIP2B_33


787 


361


2147


3933


5719 


784CIP2B 34 


833 


362 


2148 


3934 


5720


784CIP2B_35


838 


363 


2149 


3935 


5721 


784CIP2B 36 


870 


364 


2150 


3936 


5722 


784CIP2B_37


891


365 


2151 


3937 


5723 


784CIP2B 38 


891 


366 


2152 


3938 


5724 


784CIP2B 39 


921 


367 


2153 


3939 


5725


784CIP2B_40 


924 


368


2154 


3940 


5726 


784CIP2B_41 


932 


369 


2155 


3941 


5727 


784CIP2B 42 


942 


370 


2156 


3942 


5728 


784CIP2B 43 


958 


371 


2157 


3943 


5729 


784CIP2B 44 


968 


372 


2158 


3944 


5730 


784CIP2B_45


992 


373 


2159


3945 


5731 


784CIP2B 46 


1025 


374 


2160 


3946 


5732 


784CIP2B 47 


1074 


375 


2161 


3947 


5733 


784CIP2B 48 


1104 


376 


2162 


3948


5734


784CIP2B 49 


1114 


377 


2163 


3949 


5735 


784CIP2B_50


1144 


378 


2164 


3950 


5736 


784CIP2B 51 


1262 


379 


2165 


3951


5737 


784CIP2B 52 


1318 


380 


2166 


3952


5738 


784CIP2B 53 


1319 


381 


2167 


3953 


5739


784CIP2B 54 


1328 


382 


2168 


3954 


5740 


784CIP2B 55 


1436 


383 


2169 


3955 


5741 


784CIP2B 56 


1464


384 


2170 


3956 


5742


784CIP2B_57


1 CO/ 


385 


2171 


3957


5743 


784CIP2B 58 


1^17 


386 


2172 


3958 


5744 


784CIP2B 59 


1724 


387 


2173 


3959 


5745 


784CIP2B 60 


1728 


388 


2174 


3960 


5746


784CIP2B 61 


1772 


389 


2175 


3961 


5747 


784CIP2B_62


1809 


390 


2176 


3962 


5748 


784CIP2B 63 


1868 


391 


2177 


3963 


5749 


784CIP2B 64 


TTTSft 


392 


2178 


3964


5750


784CIP2B 65 


J. 3 1 0 


393 


2179 


3965 


5751 


784CIP2B_66


T$(Zk 


394 


2180 


3966 


5752 


784CIP2B_67


1 OCT 


395 


2181 


3967


5753 


784CIP2B_68


1 QBE 


396 


2182 


3968 


5754 




2005


397 


2183 


3969 


5755 




mo? 


398 


2184 


3970


5756 


784CIP2B_71


2055 


399 


2185 


3971


5757 


784CIP2B_72


*5 i m 


400 


2186 


3972 


5758 




«XuD 


401 


2187 


3973 


5759 


784CIP2B_74




402 


2188 


3974 


5760 


784CIP2B_75




403 


2189 


3975 


5761 


784CIP2B 76 


2176 


404 


2190 


3976 


5762 


784CIP2B 78 


2236


405 


2191 


3977 


5763 


784CIP2B 79 


2250 


406 


2192 


3978 


5764 


784CIP2B_80


2306 


407 


2193 


3979 


5765


784CIP2B 81 


2323 


408 


2194 


3980 


5766 


784CIP2B 82 


2340 


409 


2195 


3981 


5767 


784CIP2B 83 


2371 


410 


2196 


3982 


5768 


784CIP2B 84 


2399 


411 


2197 


3983 


5769 


784CIP2B 85 


2411


412 


2198 


3984 


5770 


784CIP2B 86 


2426 


413 


2199 


3985 


5771 


784CIP2B 87 


2430 


414 


2200 


3986 


5772 


784CIP2B 88 


2439 


415 


2201 


3987 


5773 


784CIP2B_89


2447 


416 


2202 


3988 


5774 


784CIP2B 90 


2461


417 


2203 


3989 


5775 


784CIP2B_91 


2487 


418 


2204 


3990 


5776 


784CIP2B 92 


2492 


419 


2205 


3991 


5777 


784CIP2B 93 


2512 


420


2206 


3992 


5778 


784CIP2B 94 


2564 


421 


2207 


3993 


5779 


784CIP2B 95 


2678 


422 


2208


3994 


5780 


784CIP2B 96 


2816 


423


2209 


3995 


5781 


784CIP2B 97 


2818 


424 


2210 


3996 


5782 


784CIP2B 98 


2819 


425 


2211 


3997 


5783 


784CIP2B 99 


2943 


426 


2212 


3998


5784 


784CIP2B_100


3137 


427 


2213 


3999 


5785 


784CIP2B 101 


3137 


428 


2214 


4000 


5786 


784CIP2B 102 


3160 


429 


2215 


4001 


5787 


784CIP2B 103 


3323 


430 


2216 


4002 


5788 


784CIP2B_104 


3360 


431 


2217 


4003 


5789 


784CIP2B 105 


3362 

432 


2218 


4004 


5790 


784CIP2B 106 


3417 


433 


2219


4005


5791 


784CIP2B_107 


3418


434 


2220 


4006 


5792 


784CIP2B_108


3442 


435 


2221 


4007 


5793 


784CIP2B_109 


3442 


436 


2222 


4008 


5794 


784CIP2B_110 


3444 


437 


2223 


4009 


5795 


784CIP2B_111


3855 


438 


2224 


4010 


5796 


784CIP2B_112 


3863 


439 


2225 


4011 


5797 


784CIP2B_113


4090 


440 


2226 


4012 


5798 


784CIP2B 114 


4105 


441 


2227


4013 


5799 


784CIP2B_115 


4142 


442 


2228 


4014 


5800 


784CIP2B 116 


4142 


443 


2229 


4015


5801


784CIP2B 117 


4149 


444 


2230 


4016


5802 


784CIP2B 118 


4196 


445


2231


4017 


5803


784CIP2B 119 


4202 


446 


2232 


4018 


5804 


784CIP2B 120 


4274 


447


2233 


4019 


5805 


784CIP2B 121 


4304 


448 


2234 


4020 


5806 


784CIP2B 122 


4306 


449 


2235 


4021 


5807 


784CIP2B 123 


4311 


450 


2236 


4022


5808 


784CIP2B 124 


4321 


451 


2237 


4023 


5809 


784CIP2B 125 


4323 


452 


2238 


4024 


5810 


784CIP2B_126


4332 


453 


2239 


4025 


5811 


784CIP2B 127 


4488 


454 


2240 


4026 


5812 


784CIP2B 128 


4588 


455 


2241 


4027 


5813 


784CIP2B_129




456 


2242 


4028 


5814 


784CIP2B 130 


5573 


457


2243 


4029 


5815 


784CIP2B 131 


5577 


458 


2244 


4030 


5816 


784CIP2B_132


5579 


459 


2245 


4031 


5817 


784CIP2B 133 


5582 


460 


2246 


4032 


5818 


784CIP2B 134 


5583 


461 


2247 


4033 


5819 


784CIP2B 135 


5584 


462 


2248 


4034


5820


784CIP2B 136 


5585 


463 


2249 


4035 


5821 


784CIP2B_137


5591 


464 


2250 


4036 


5822


784CIP2B 138 


5593 


465 


2251 


4037 


5823 


784CIP2B 139 




466 


2252 


4038 


5824 


784CIP2B 140 


5594 


467 


2253 


4039 


5825 


784CIP2B 141 


5598 


468 


2254 


4040 


5826 


784CIP2B 142 


5602 


469 


2255 


4041 


5827 


784CIP2B 143 


5605 


470 


2256 


4042 


5828 


784CIP2B_144


5608 


471 


2257 


4043 


5829 


784CIP2B_145


5617 


472 


2258 


4044 


5830 


784CIP2B_146


5620 


473 


2259 


4045 


5831 


784CIP2B 147 


5622 


474


2260 


4046 


5832 


784CIP2B_148


5623 


475 


2261 


4047 


5833 


784CIP2B_149


5624 


476 


2262 


4048 


5834 


784CIP2B 150 


5625 


477 


2263 


4049 


5835 


784CIP2B_151


5627 


478 


2264 


4050 


5836


784CIP2B_152 


5628 


479 


2265 


4051 


5837 


784CIP2B 153 


5630


480 


2266


4052 


5838 


784CIP2B_154 


5632 


481 


2267 


4053 


5839 


784CIP2B 155 


5640 


482 


2268 


4054 


5840


784CIP2B 156 


5641 


483 


2269 


4055


5841 


784CIP2B 157 


5643 


484 


2270 


4056 


5842 


784CIP2B 158 


5647 


485 


2271 


4057 


5843 


784CIP2B 159 


5649 


486 


2272 


4058 


5844 


784CIP2B 160 


5658 


487 


2273 


4059 


5845 


784CIP2B_161 


5659 


488 


2274 


4060 


5846 


784CIP2B_162 


5667 


489 


2275 


4061 


5847 


784CIP2B_163 


5672 


490 


2276 


4062 


5848 


784CIP2B_164


5674 


491


2277 


4063 


5849 


784CIP2B_165


5678 


492 


2278 


4064 


5850


784CIP2B_166 


5680


493 


2279 


4065 


5851 


784CIP2B_167


5684


494 


2280 


4066


5852 


784CIP2B_168


5686 


495 


2281


4067 


5853 


784CIP2B 169 




496


2282 


4068 


5854 


784CIP2B_170


5698


497 


2283 


4069 


5855 


784CIP2B 171 


5699


498 


2284 


4070 


5856 


784CIP2B 172 


5712 


499 


2285 


4071 


5857 


784CIP2B 173 


5719


500 


2286 


4072 


5858 


784CIP2B 174 


5720 


501 


2287 


4073 


5859 


784CIP2B_175


£77 7 


502 


2288 


4074 


5860 


784CIP2B_176


C 77 n 


503 


2289 


4075 


5861 


784CIP2B_177




504 


2290 


4076 


5862 


784CIP2B_178


5738


505 


2291


4077 


5863 




5739


506 


2292 


4078


5864


784CIP2B_180


5740


507 


2293 


4079 


5865 


784CIP2B_181


5744 


508 


2294 


4080 


5866


784CIP2B_182


5748 


509 


2295 


4081 


5867


784CIP2B_183


5749 


510 


2296 


4082 


5868




5750 


511 


2297 


4083 


5869 


784CIP2B_185


5750 


512 


2298 


4084 


5870


784CIP2B_186


5750 


513 


2299 


4085 


5871 


784CIP2B_187


5761 


514 


2300 


4086 


5872


784CIP2B_188


5762 


515 


2301 


4087 


5873


784CIP2B_189


5767 


516 


2302


4088 


5874


784CIP2B_190


5773 


517 


2303 


4089 


5875


784CIP2B_191


5783 


518


2304 


4090


5876


784CIP2B_192


5784 


519 


2305 


4091 


5877


784CIP2B_193


5788 


520 


2306


4092


5878 


784CIP2B 194 


5798 


521 


2307 


4093 


5879 


784CIP2B 196 


5807 


522 


2308 


4094


5880 


784CIP2B_197


5818 


523 


2309


4095


5881 


784CIP2B_198


5819 


524 


2310


4096


5882 


784CIP2B 199 


5827 


525 


2311 


4097 


5883


784CIP2B_200


5828


526 


2312 


4098


5884


784CIP2B_201


5842 


527 


2313 


4099 


5885 


784CIP2B_202


5853


528 


2314 


4100 


5886 


784CIP2B_203


5861 


529 


2315 


4101 


5887 


784CIP2B_204


5864 


530 


2316 


4102 


5888 


784CIP2B_205


5865 


531 


2317 


4103 


5889


784CIP2B_206


5871 


532 


2318 


4104 


5890 


784CIP2B_207


5873 


533 


2319 


4105 


5891 


784CIP2B_208


5873 


534 


2320 


4106


5892


784CIP2B_209


5875 


535 


2321 


4107 


5893 




5878 


536 


2322 


4108


5894


784CIP2B_211


5879 


537 


2323 


4109 


5895 




5880


538 


2324 


4110 


5896




5880 


539 


2325 


4111 


5897 


784CIP2B_214


5880 


540


2326


4112 


5898 




5880 


541 


2327 


4113 


5899 


784CIP2B_216


5885 


542


2328 


4114 


5900 


784CIP2B_217


5895 


543 


2329 


4115 


5901 


784CIP2B_218


3030 


544 


2330 


4116 


5902 




5902 


545 


2331 


4117


5903


784CIP2B_220


5904 


546 


2332 


4118


5904


784CIP2B_221


5918 


547 


2333 


4119 


5905 


784CIP2B_222


5921 


548 


2334 


4120 


5906 


784CIP2B 223 


5927 


549 


2335 


4121 


5907 


784CIP2B 224 


5932 


550 


2336 


4122 


5908 


784CIP2B_225


5939 


551 


2337 


4123 


5909 


784CIP2B 226 


5945 


552 


2338 


4124 


5910 


784CIP2B 227 


5946


553 


2339 


4125 


5911 


784CIP2B_228 


5947 


554 


2340 


4126 


5912 


784CIP2B 229 


5956 


555 


2341 


4127 


5913 


784CIP2B 230 


5967 




556


2342 


4128 


5914


784CIP2B_232 


5975 


557


2343 


4129 


5915 


784CIP2B_233 


5977 


558


2344 


4130 


5916 


784CIP2B 234 


5978 


559 


2345 


4131 


5917 


784CIP2B 235 


5979


560 


2346 


4132 


5918 


784CIP2B_236 


5980 


561 


2347 


4133 


5919 


784CIP2B 237 


5988 


562 


2348 


4134 


5920 


784CIP2B_238


5989 


563 


2349 


4135 


5921 


784CIP2B_239 


5991 


564 


2350 


4136 


5922 


784CIP2B_240 


5997 


565 


2351 


4137 


5923 


784CIP2B_241


5998 


566 


2352 


4138 


5924 


784CIP2B 242 


6003 


567 


2353 


4139 


5925 


784CIP2B 243 


6004 


568 




2354 


4140 


5926


784CIP2B 244 


6013 


569 


2355 


4141 


5927 


784CIP2B_245


6028 


570


2356 


4142 


5928 


784CIP2B_246 


6028 


571 


2357 


4143 


5929 


784CIP2B 247 


6029 


572 


2358 


4144 


5930 


784CIP2B_248 


6031 


573 


2359 


4145 


5931 


784CIP2B 249 


6031 


574 


2360 


4146 


5932 


784CIP2B_250


6032 


575 


2361


4147 


5933 


784CIP2B 251 


6037 


576 


2362 


4148 


5934 


784CIP2B 252 


6037 


577 


2363 


4149 


5935


784CIP2B_253 


6043 


578 


2364 


4150 


5936 


784CIP2B 254 


6044 


579 


2365 


4151 


5937 


784CIP2B 255 


6046 


580 


2366 


4152 


5938 


784CIP2B 256 


6048 


581 


2367 


4153 


5939 


784CIP2B_257 


6049 


582 


2368 


4154 


5940 


784CIP2B_258


6051


583 


2369 


4155 


5941 


784CIP2B_259 


6053 


584 


2370 


4156 


5942 


784CIP2B_260 


6060 


585 


2371 


4157 


5943 


784CIP2B_261 


6063 


586 


2372 


4158 


5944 


784CIP2B_262


6066 


587 


2373 


4159 


5945 


784CIP2B_263 


6067 


588 


2374 


4160 


5946 


784CIP2B 264 


6068 


589 


2375 


4161 


5947 


784CIP2B 265 


6073 


590 


2376 


4162 


5948 


784CIP2B_266 


6076 


591 


2377 


4163 


5949 


784CIP2B 267 


6076 


592 


2378 


4164 


5950 


784CIP2B_268 


6077 


593 


2379 


4165 


5951 


784CIP2B 269 


6079 


594 


2380 


4166 


5952 


784CIP2B_270 


6082 


595 


2381 


4167 


5953 


784CIP2B_272


6088 


596 


2382 


4168 


5954 


784CIP2B 273 


6091 


597 


2383 


4169 


5955 


784CIP2B_274 


6094 


598 


2384 


4170 


5956 


784CIP2B 275 


6101 


599 


2385 


4171 


5957 


784CIP2B_276 


6103 


600


2386


4172 


5958


784CIP2B 277 


6104 


601 


2387


4173 


5959 


784CIP2B 278 


6108 


602


2388


4174 


5960 


784CIP2B_279 


6112


603


2389


4175 


5961 


784CIP2B_280


6121


604


2390


4176 


5962 


784CIP2B 281 


6125 


605


2391 


4177 


5963 


784CIP2B_282 


6126 


606


2392 


4178 


5964 


784CIP2B_283 


6128 


607 


2393 


4179 


5965 


784CIP2B_284


6124 


608


2394 


4180 


5966 


784CIP2B 285 


6133 


609 


2395


4181


5967 


784CIP2B_286 


6133 


610 


2396


4182


5968


784CIP2B 287 


6135 


611 


2397 


4183 


5969


784CIP2B_288


6139 


612 


2398 


4184 


5970 


784CIP2B 289 


6141 


613 


2399 


4185 


5971


784CIP2B 290 


6145 


614 


2400 


4186 


5972 


784CIP2B 291 


6146 


615 


2401 


4187 


5973 


784CIP2B_292


6148 


616 


2402 


4188 


5974 


784CIP2B 293 


6149 


617


2403


4189 


5975


784CIP2B 294 


6149 


618 


2404 


4190 


5976 


784CIP2B 295 


6153 


619 


2405 


4191 


5977 


784CIP2B_296


Dig? 


620 


2406


4192 


5978 


784CIP2B_297


Ci c/ 


621 


2407 


4193 


5979 


784CIP2B_298


a its / 


622 


2408 


4194


5980




OX #4 


623


2409 


4195 


5981 


784CIP2B_300


OX / j 


624 


2410 


4196 


5982 


784CIP2B_301


CT OA 

oxy u 


625 


2411 


4197 


5983 


784CIP2B_302


6194 


626 


2412 


4198 


5984 


784CIP2B 303 


CI QC 

o xy o 


627 


2413 


4199 


5985 


784CIP2B_304


6197 


628 


2414 


4200 


5986 


784CIP2B_305


6198 


629 


2415 


4201 


5987 


784CIP2B_306


6198 


630 


2416 


4202 


5988 


/O^LlrAO JUD 


6214


631 


2417 


4203 


5989 




6215 


632 


2418 


4204 


5990 


784CIP2B_310


6219


633 


2419 


4205


5991 


784CIP2B_311


6226 


634 


2420 


4206 


5992




6229 


635 


2421 


4207 


5993 




6234 


636 


2422


4208


5994


784CIP2B_314


6237 


637 


2423 


4209 


5995


784CIP2B_315


6238 


638 


2424 


4210 

5996




784CIP2B_316


6239 


639 


2425 


4211


5997 


784CIP2B 317 


6239 


640 


2426 


4212


5998 


784CIP2B 318 


6239 


641 


2427 


4213


5999


784CIP2B 319 


6240 


642 


2428 


4214


6000


784CIP2B 320 


6244 


643 


2429 


4215


6001 


784CIP2B 321 


6245 


644 


2430


4216


6002 


784CIP2B_322 


6250 


645 


2431 


4217 


6003 


784CIP2B 323 


6252 


646 


2432


4218 


6004


784CIP2B 324 


6252 


647 


2433 


4219 


6005


784CIP2B_325


6256 


648 


2434 


4220 


6006 


784CIP2B_326 


6260 


649 


2435 


4221 


6007


784CIP2B_327


6261 


650 


2436


4222 


6008


784CIP2B_328


6264 


651 


2437 


4223


6009 


784CIP2B 329 


6265 


652 


2438 


4224 


6010 


784CIP2B_330 


6266 


653 


2439 


4225


6011


784CIP2B 331 


6270 


654 


2440 


4226


6012


784CIP2B_332


6271


655


2441 


4227 


6013


784CIP2B_334


6274 


656 


2442 


4228


6014


784CIP2B_335


6276 


657 


2443 


4229 


6015


784CIP2B_336


6281 


658 


2444 


4230 


6016


784CIP2B_337


6281 


659


2445 


4231 


6017 




6288 


660 


2446


4232 


6018


784CIP2B_339


6292 


661 


2447 


4233 


6019 


7fiAr , TD*>tJ iAn 


0 2 94 


662 


2448 


4234 


6020


7BdPTD*)n iii 


GO.T7 - 
b J 12 


663 


2449 


4235 


6021 


784r , TD0B ^44 


Clio 


664 


2450 


4236 


6022 




CO 1 0 
0.312 


665 


2451 


4237 


6023


/ OiLi r^fl J ^ o 


CI *50 


666 


2452 


4238 


6024 


TfiAfTD^D OAO 


6324


667 


2453 


4239 


6025 




6329 


668 


2454 


4240 


6026 




6331


669 


2455 


4241


6027 


784CIP2B_351


6333 


670 


2456 


4242 


6028


784CIP2B 352 


6334 


671


2457 


4243 


6029


784CIP2B 353 


6337 


672 


2458 


4244 


6030 


784CIP2B 354 


6339 


673 


2459 


4245


6031 


784CIP2B_355 


6346 


674 


2460 


4246 


6032 


784CIP2B 356 


6348 


675 


2461 


4247 


6033 


784CIP2B 357 


6348


676


2462


4248 


6034 


784CIP2B 358 


6350 


677 


2463 


4249 


6035 


784CIP2B 359 


6351 


678 


2464 


4250 


6036 


784CIP2B 360 


6355


679 


2465 


4251 


6037 


784CIP2B_361


6362


680 


2466 


4252


6038 


784CIP2B 362 


6368 


681 


2467 


4253 


6039 


784CIP2B 363 


6369 


682


2468 


4254 


6040 


784CIP2B_364


6371 


683 


2469


4255 


6041


784CIP2B_365


6376 


684 


2470


4256 


6042 


784CIP2B 366 


6379


685 


2471 


4257 


6043 


784CIP2B 367 


6380


686 


2472 


4258


6044 


784CIP2B_368


6381


687 


2473 


4259 


6045


784CIP2B_369


filQO 


688 


2474 


4260 


6046 


784CIP2B_370


CIDC 
D J 33 


689 


2475 


4261 


6047 


784CIP2B_371


dj y / 


690 


2476 


4262 


6048 


784CIP2B_372




691 


2477 


4263 


6049 


784CIP2B_373


OH UX 


692 


2478 


4264 


6050 


784CIP2B_374


OH XX 


693 


2479 


4265 


6051 


784CIP2B_375




694 


2480 


4266 


6052 


784CIP2B_376


0411 


695 


2481 


4267 


6053 


784CIP2B_377


olio 


696


2482 


4268 


6054 


784CIP2B_378


6418 


697 


2483 


4269


6055


784CIP2B_379


6422 


698 


2484


4270 


6056


784CIP2B_380


6423 


699 


2485


4271 


6057


784CIP2B_381


6426 


700 


2486


4272 


6058 


784CIP2B_382


6427


701 


2487 


4273 


6059 




6428 


702 


2488


4274


6060 


784CIP2B_384


6429 


703 


2489 


4275 


6061 


784CIP2B_385


6430 


704 


2490 


4276


6062


784CIP2B_386


6432 


705 


2491 


4277 


6063 




6432 


706 


2492 


4278 


6064 




6438 


707 


2493 


4279 


6065 




6441 


708 


2494 


4280 


6066 




6446 


709 


2495 


4281 


6067 




£ti ca 

6454 


710 


2496 


4282 


6068 




6459 


711 


2497 


4283 


6069 


7A4PTP7R io7"" 




712 


2498


4284 


6070 




64o / 


713 


2499 


4285 


6071 


7Q/ir , TDOia lot 


6468 


714 


2500


4286 


6072 




6487


715 


2501 


4287 


6073 


7fldr"TD0R loo T 


6491 


716 


2502 


4288 


6074 


7R4r , TP7R tQQ 


obUo 


717 


2503 


4289 


6075




6514 


718 


2504


4290 


6076 




6519 


719


2505 


4291 


6077 




6521 


720 


2506 


4292 


6078 




6532 


721


2507 


4293 


6079 


7 fl 4 C I P 5 R." "Z ft e; 


gtc-ii* 

03-3 D 


722 


2508 


4294 


6080 


784CTP7R 4ft*; 

' OH C £,D HUD 


6543 


723 


2509 


4295 


6081 


784CIP2B_407


CCA A 


724 


2510 


4296 


6082


784CIP2B_408


(TC/1 O 
DD*i O 


725 


2511 


4297 


6083 


784CIP2B_409


6551


726 


2512 


4298 


6084 


784CIP2B_410


6551 


727 


2513 


4299


6085 


784CIP2B 411 


6552 


728 


2514 


4300 


6086 


784CIP2B_412




729 


2515 


4301 


6087 


784CIP2B_413




730


2516 


4302 


6088 


784CIP2B_414


b jdu 


731


2517 


4303 


6089 


784CIP2B_415


ODD J 


732 


2518 


4304 


6090 


784CIP2B_416


CCCA 

bbb4 


733 


2519 


4305 


6091


784CIP2B_417


6567 


734 


2520 


4306 


6092 


784CIP2B 418 


6573 


735 


2521 


4307 


6093 


784CIP2B 419 


6575 


736 


2522 


4308 


6094 


784CIP2B_420 


6577


737 


2523 


4309 


6095 


784CIP2B 421 


6593


738 


2524 


4310 


6096 


784CIP2B_422 


6595 


739 


2525 


4311 


6097 


784CIP2B 423 


6599 


740 


2526 


4312 


6098 


784CIP2B 424 


6625 


741 


2527 


4313 


6099 


784CIP2B 425 


6625 


742 


2528 


4314 


6100 


784CIP2B_426


bocb 


743 


2529


4315 


6101 


784CIP2B_427


ceo n 
DO J U 


744 


2530 


4316 


6102 


784CIP2B_428 


6631 


745 


2531 


4317 


6103 


784CIP2B 429 


6632 


746 


2532


4318


6104


784CIP2B 430 


6633 


747 


2533


4319


6105 


784CIP2B 431 


6634 


748 


2534 


4320 


6106 


784CIP2B_432


6638 


749 


2535


4321


6107 


784CIP2B 433 


6641


750


2536


4322 


6108 


784CIP2B_434


6644 


751


2537 


4323 


6109


784CIP2B_435


6646


752 


2538 


4324


6110


784CIP2B_436


6648


753 


2539 


4325 


6111 


784CIP2B_437


6652


754 


2540 


4326 


6112 


784CIP2B 438 


6654 


755 


2541 


4327


6113 


784CIP2B 439 


6657 


756 


2542 


4328 


6114


784CIP2B 440 


6658 


757 


2543


4329


6115


784CIP2B_441


6663 


758 


2544


4330


6116


784CIP2B 442 


6664 


759


2545 


4331


6117


784CIP2B 443 


6668 


760 


2546


4332


6118


784CIP2B_444


6669 


761 


2547 


4333


6119 


784CIP2B 445 


6673 


762


2548


4334 


6120 


784CIP2B 446 


6685 


763


2549


4335


6121


784CIP2B 447 


6687 


764 


2550 


4336


6122


784CIP2B 448 


6689 


765 


2551 


4337


6123


784CIP2B_449


6693 


766 


2552 


4338


6124


784CIP2B_450


6698 


767 


2553


4339


6125


784CIP2B_451


6699 


768


2554


4340 


6126 




6705 


769 


2555 


4341


6127


784CIP2B_453


6711 


770 


2556


4342 


6128 


784CIP2B_454


6713 


771 


2557 


4343


6129


784CIP2B_455


6716 


772


2558


4344 


6130


784CIP2B_456


6725 


773


2559


4345 


6131 


784CIP2B_457


6726


774 


2560 


4346 


6132 


784CIP2B_458


— ■ — cTvi 


775


2561


4347 


6133 


784CIP2B_459


6730


776 


2562


4348


6134 


784CIP2B_460


6730


777 


2563 


4349


6135


784CIP2B_461


6730


778 


2564 


4350


6136


784CIP2B 462 


6732 


779 


2565 


4351


6137


784CIP2B 463 


6733 


780 


2566 


4352 


6138


784CIP2B_464


6737 


781 


2567 


4353 


6139 


784CIP2B_465 


6745 


782 


2568 


4354 


6140 


784CIP2B_466 


6751 


783 


2569 


4355


6141


784CIP2B 467 


6754 


784 


2570 


4356 


6142


784CIP2B_468 


6758 


785 


2571 


4357


6143


784CIP2B 469 


6761 


786 


2572 


4358 


6144 


784CIP2B 470 


6765 


787 


2573 


4359


6145


784CIP2B_471 


6768 


788 


2574 


4360 


6146


784CIP2B_472 


6773 


789 


2575 


4361 


6147 


784CIP2B_473


6776 


790 


2576 


4362


6148


784CIP2B_474 


6796 


791 


2577 


4363


6149


784CIP2B 475 


6798 


792 


2578


4364


6150 


784CIP2B 476 


6823 


793 


2579 


4365 


6151


784CIP2B_477 


6825 


794 


2580 


4366


6152 


784CIP2B 478 


6826 


795 


2581 


4367 


6153


784CIP2B 479 


6839 


796 


2582 


4368 


6154 


784CIP2B_480 


6844 


797 


2583 


4369


6155


784CIP2B_482


6849 


798 


2584 


4370 


6156 


784CIP2B_483


6854 


799 


2585 


4371 


6157 


784CIP2B_484 


6857 


800 


2586 


4372 


6158 


784CIP2B 485 


6861 


801 


2587 


4373 


6159 


784CIP2B 486 


6873 


802 


2588 


4374 


6160 


784CIP2B 487 


6875 


803 


2589 


4375 


6161 


784CIP2B 488 


6877 


804 


2590 


4376


6162 


784CIP2B 489 


6880 


805 


2591 


4377 


6163 


784CIP2B 490 


6885 


806 


2592 


4378 


6164 


784CIP2B 491 


6890 


807 


2593 


4379 


6165 


784CIP2B 492 




808 


2594 


4380


6166


784CIP2B 493 


6894 


809


2595


4381 


6167 


784CIP2B 494 


6901 


810 


2596 


4382 


6168 


784CIP2B 495 


6904 


811 


2597 


4383 


6169 


784CIP2B 496 


6907


812 


2598 


4384 


6170 


784CIP2B 497 


6914 


813 


2599 


4385 


6171 


784CIP2B 498 


6917 


814


2600 


4386 


6172 


784CIP2B 499 


6923


815 


2601 


4387 


6173 


784CIP2B 500 


6929


816 


2602 


4388 


6174


784CIP2B 501 


6931 


817 


2603 


4389 


6175


784CIP2B_502


CQTC 1 ■ — 1 
D7J 3 


818 


2604 


4390 


6176


784CIP2B_503


6940


819 


2605 


4391 


6177


784CIP2B_504


by*t 3 


820 


2606 


4392 


6178 


784CIP2B_505




821


2607 


4393 


6179 




6947 


822 


2608 


4394 


6180 




6949


823 


2609 


4395 


6181 


784CIP2B_508


6959 


824 


2610 


4396


6182 


784CIP2B_509


&9&U 


825 


2611 


4397 


6183


784CIP2B_510


6962 


826 


2612 


4398


6184


784CIP2B_511


6963 


827 


2613 


4399 


6185


784CIP2B_512


6967 


828 


2614 


4400 


6186 


784CIP2B_513


doai 


829 


2615 


4401 


6187




ca'da 


830 


2616 


4402 


6188


784CIP2B_515


6996 


831 


2617 


4403 


6189 




7003 


832 


2618 


4404 


6190 


784CIP2B_517


7016 


833 


2619 


4405 


6191 


784CIP2B_518


7017 


834 


2620 


4406 


6192 




tTvTc 

7025 


835 


2621 


4407 


6193 


784CIP2B_520


7025 


836 


2622 


4408 


6194 




-moa 


837 


2623 


4409 


6195 


784CIP2B_522


•inert 


838 


2624 


4410 


6196


/ O ** X tr <r4 D 


7051 


839 


2625 


4411


6197 




7055 


840 


2626 


4412 


6198 


784CIP2B_525


7060 


841 


2627 


4413 


6199 


784CIP2B_526


f Ub4 


842 


2628 


4414 


6200 


784CIP2B 527 


/WO f 


843


2629 


4415 


6201 


784CIP2B 528 


f U rl 


844 


2630


4416 


6202 


784CIP2B 529 


7072 


845 


2631 


4417 


6203 


784CIP2B 530 


7073


846 


2632 


4418 


6204 


784CIP2B_531


7Q7£ 


847 


2633 


4419 


6205


784CIP2B 532 


7088


848 


2634


4420 


6206 


784CIP2B 533 


' U 0 3 


849 


2635 


4421 


6207 


784CIP2B_534




850 


2636 


4422 


6208 


784CIP2B 535 


7091 


851 


2637 


4423 


6209 


784CIP2B 536 


7104 


852 


2638


4424 


6210 


784CIP2B_537


7105 


853 


2639 


4425 


6211 


784CIP2B 538 


7105 


854 


2640 


4426 


6212 


784CIP2B_539


7109 


855 


2641 


4427


6213 


784CIP2B_540


7109


856 


2642 


4428 


6214 


784CIP2B 541 


7119


857 


2643 


4429 


6215 


784CIP2B_542


7120


858 


2644 


4430 


6216 


784CIP2B_543 


7121 


859 


2645 


4431 


6217 


784CIP2B_544


7126 


860 


2646 


4432 


6218 


784CIP2B 545 


7127 


861 


2647 


4433 


6219 


784CIP2B_546


7130 


862 


2648 


4434 


6220 


784CIP2B 547 


7131 


863 


2649 


4435


6221


784CIP2B 548 


7144 


864 


2650 


4436 


6222 


784CIP2B 549 


7159 


865 


2651


4437


6223 


784CIP2B 550 


7163


866 


2652 


4438 


6224 


784CIP2B_551


7175 


867 


2653 


4439 


6225 


784CIP2B 552 


7188 


868 


2654 


4440 


6226 


784CIP2B 553 


7189 


869 


2655 


4441 


6227 


784CIP2B_554 


7190 


870 


2656 


4442 


6228 


784CIP2B_555 


7191 


871 


2657 


4443 


6229 


784CIP2B_556 


7203 


872 


2658 


4444 


6230 


784CIP2B_557 


7204 


873 


2659 


4445 


6231 


784CIP2B_558


7208 


874 


2660 


4446 


6232 


784CIP2B_559 


7209 


875 


2661 


4447 


6233 


784CIP2B_560 


7210 


876 


2662 


4448 


6234 


784CIP2B 561 


7216 


877 


2663 


4449 


6235 


784CIP2B_562 


7221 


878 


2664 


4450 


6236 


784CIP2B 563 


7230 


879 


2665 


4451 


6237 


784CIP2B_564 


7237 


880 


2666 


4452 


6238 


784CIP2B_565 


7240 


881 


2667 


4453 


6239 


784CIP2B 566 


7245 


882 


2668 


4454 


6240 


784CIP2B_567 


7250 


883 


2669 


4455 


6241 


784CIP2B 568 


7251 


884 


2670 


4456 


6242 


784CIP2B_569




885 


2671 


4457 


6243 


784CIP2B 570 


7260 


886 


2672 


4458 


6244 


784CIP2B_571 


7265 


887 


2673 


4459 


6245 


784CIP2B 572 


7268 


888 


2674 


4460 


6246 


784CIP2B_573 


7275 


889 


2675 


4461 


6247 


784CIP2B 574 


7279 


890 


2676 


4462 


6248 


784CIP2B_575 


7283 


891


2677 


4463 


6249


784CIP2B_576


7283 


892 


2678 


4464 


6250 


784CIP2B_577


7287 


893 


2679 


4465 


6251 


784CIP2B_578 


7301 


894 


2680 


4466 


6252 


784CIP2B_579 


7308 


895 


2681 


4467 


6253 


784CIP2B 580 


7308 


896 


2682 


4468 


6254 


784CIP2B 581 


7309 


897 


2683 


4469 


6255 


784CIP2B_582 


7319


898 


2684 


4470 


6256 


784CIP2B_583 


7320 


899 


2685


4471 


6257 


784CIP2B 584 


7326


900 


2686 


4472 


6258 


784CIP2B_585 


7326 


901 


2687 


4473 


6259 


784CIP2B_586 


7334 


902 


2688 


4474 


6260 


784CIP2B_587 


7337 


903 


2689 


4475 


6261 


784CIP2B_588 


7339 


904 


2690 


4476 


6262 


784CIP2B_589 


7344 


905 


2691 


4477 


6263


784CIP2B 590 


7355 


906


2692 


4478 


6264 


784CIP2B_591 


7363


907 


2693 


4479 


6265 


784CIP2B 592 


7363 


908 


2694


4480 


6266 


784CIP2B_593 


7365 


909 


2695 


4481 


6267 


784CIP2B_594 


7368


910 


2696 


4482 


6268 


784CIP2B 595 


7369 


911 


2697 


4483 


6269 


784CIP2B 596 


7372 


912 


2698 


4484 


6270 


784CIP2B_599 


7375 


913 


2699 


4485 


6271 


784CIP2B_600 


7381 


914 


2700 


4486 


6272 


784CIP2B_601 


7383 


915 


2701 


4487 


6273 


784CIP2B_602 


7387 


916 


2702 


4488 


6274 


784CIP2B 603 


7391 


917 


2703 


4489 


6275


784CIP2B_604


7393 


918 


2704 


4490 


6276 


784CIP2B 605 


7395 


919 


2705 


4491 


6277 


784CIP2B_606


7397 


920 


2706 


4492


6278 


784CIP2B_607 


7399 


921 


2707 


4493 


6279 


784CIP2B_608 


7405 


922 


2708


4494 


6280 


784CIP2B 609 


7406


923 


2709 


4495 


6281 


784CIP2B_610


7406 


924 


2710 


4496 


6282 


784CIP2B_611 


7409 


925 


2711 


4497 


6283 


784CIP2B 612 


7410 


926 


2712 


4498 


6284 


784CIP2B_613


7411


927 


2713 


4499 


6285 


784CIP2B_614 


7417


928 


2714


4500


6286


784CIP2B_615


7418 


929 


2715


4501


6287


784CIP2B_616


7421 


930 


2716


4502




784CIP2B_617


7422 


931 


2717






784CIP2B_618


7422 


932 


2718 


4504 




784CIP2B_619


7423 


933 


2719 


4505 


6291 


784CIP2B_620


7424 


934 


2720 


4506 






7426 


935


2721 




/: O n -a 


784CIP2B_622


7427 


936 


2722 


4508




784CIP2B_623


7428 


937 


2723






784CIP2B_624


7430 


938 


2724 








7435 


939 


2725 




bZy J 


784CIP2B_626


7437 


940 




4512


6298 


784CIP2B 627 


7439 


941 


2727


4513


6299 


784CIP2B 628 


7440 


942


2728 


4514


6300 


784CIP2B_629


7442 


943 


2729 


4515 


6301 


784CIP2B 630 


7450 


944




4516 


6302 


784CIP2B_631


7451 


945 


2731


4517 


6303 


784CIP2B 632 


7452 


946 


2732


4518 


6304 


784CIP2B_633


7454 


947


2733 


4519 


6305 


784CIP2B 634 


7457 


948


2734 


4520 


6306 


784CIP2B 635 


7459 


949 




4521 


6307 


784CIP2B 636 


7461 


950 


2736


4522 


6308


784CIP2B_637 


7463 


951 


2737


4523 


6309 


784CIP2B 638 


7466 


952 


2738


4524 


6310 


784CIP2B_639 


7469




2739


4525 


6311 


784CIP2B_640 


7473


954 


2740 


4526 


6312 


784CIP2B_641


7481 


955


2741


4527 


6313 


784CIP2B_642 


7482 




2742 


4528 


6314 


784CIP2B 643 


7482


957


2743


4529 


6315 


784CIP2B 644 


7483 




2744 


4530 


6316 


784CIP2B_645 


7485


959


2745


4531 


6317 


784CIP2B_646


7486 


960


2746 


4532 


6318


784CIP2B 647 


7487


961


2747 


4533 


6319 


784CIP2B_648 


7491 


962 


2748


4534 


6320 


784CIP2B_649


7492 


963 


2749 


4535 


6321 


784CIP2B_650 


7494 


964 


2750




6322 


784CIP2B_651


7498 


965




4537


6323 


784CIP2B_652


7504 


966 


2752 




6324 


784CIP2B_653


7508 


967 


2753


4539


6325 


784CIP2B_654


7516 


968 


2754




cTTZ — 


784CIP2B 655 


7518


969 


2755 


4541


6327


784CIP2B 656 


7519 


970 


2756 




6328 


784CIP2B 657 


7521 


971 


2757 




6329 


784CIP2B_658


7529 


972 


2758 




6330 


784CIP2B 659 


7532 


973 


2759 




6331 


784CIP2B_660


7533 


974 


2760 


4546


6332 


784CIP2B_661 


7535 


975 


2761


4547


6333 


784CIP2B_662


7545 


976 


2762 


4548 


6334


784CIP2B_663


7546 


977 


2763


4549


6335


784CIP2B_664


7552 


978 


2764 




6336


784CIP2B_665


7554


979 


2765 


4551 


6337 


784CIP2B_666


7567 


980 


2766 




6338


784CIP2B_667


7569 


981 


2767




6339 


784CIP2B 668 


7575 


982 


2768 


4554 


6340 


784CIP2B_669


7576 


983 


2769 


4555 


6341 


784CIP2B_670


7577


984


2770 


4556 


6342 


784CIP2B_671


7579 


985 


2771 


4557 


6343 


784CIP2B_672


7582 


986 


2772 


4558 


6344 


784CIP2B 673 


7587 


987 


2773 


4559 


6345 


784CIP2B_674


7589 


988 


2774 


4560 


6346 


784CIP2B_675


7597 


989 


2775 


4561 


6347 


784CIP2B_676


7597


990 


2776 


4562 


6348 


784CIP2B_677


/buy 


991 


2777 


4563 


6349 


784CIP2B_678


tc no 
/buy 


992 


2778 


4564 


6350 


784CIP2B_679


/bus 


993 


2779


4565 


6351


784CIP2B_680


/oJLJ 


994 


2780 


4566 


6352 


784CIP2B_681




995 


2781 


4567 


6353 


784CIP2B_682


7J?7 Q 


996 


2782 


4568 


6354 


784CIP2B_683




997 


2783 


4569 


6355


784CIP2B_684


/ b J j 


998 


2784 


4570 


6356 


784CIP2B_685


/ OJD 


999 


2785 


4571 


6357 


/0**C-Lr r £D ODD 


7638


1000 


2786


4572 


6358 


784CIP2B_687


7639


1001 


2787 


4573


6359 


7ft/t <**TD*>n Coo 


7646 


1002 


2788 


4574 






7647 


1003 


2789


4575 


6361 


784CIP2B_690


7648 


1004 


2790 


4576




784CIP2B_691


7658 


1005 


2791 


4577 





784CIP2B_692


7664 


1006 


2792 


4578 


6364 




7664 


1007 


2793 


4579 


6365 


784CIP2B_695


7674 


1008 


2794 


4580 


6366


784CIP2B_696


7675 


1009 


2795 


4581 


6367


784CIP2B_697


7676 


1010 


2796 


4582 


6368 


784CIP2B_698


7681 


1011 


2797 


4583 




784CIP2B_699


7688 


1012 


2798 


4584 


6370 




7693


1013 


2799 


4585 


6371


784CIP2B_701


7694 


1014 


2800 




0*70 

b J /2 


784CIP2B 702 


7715 


1015 


2801 


4587 


6373


784CIP2B_703


7716 


1016


2802 


4588 


6374 


784CIP2B 704 


7718 


1017 


2803 


4589


6375 


784CIP2B 705 


7721 


1018 


2804 


4590


6376


784CIP2B_706


7723 


1019 


2805


4591


6377 


784CIP2B 707 


7729 


1020 


2806 


4592 


6378 


784CIP2B_708


7733 


1021 


2807 


4593 




784CIP2B_709


7735 


1022 


2808


4594 


6380 


784CIP2B_710


7741 


1023 


2809 


4595 


6381


784CIP2B_711


7743 


1024 


2810 


4596 


6382


784CIP2B_712


7748 


1025 


2811 


4597 


6383 


784CIP2B_713


7749 


1026 


2812 


4598 


6384


784CIP2B_714


7750 


1027 


2813 


4599 


6385 


TBAPTD*)n fir 


7757 


1028 


2814 


4600 




784CIP2B_716


7759 


1029 


2815 


4601


6387 


r 2B / 1 / 


7760 


1030


2816


4602 


6388


/ 0 1 u. X f2 a fx a 


7760 


1031 


2817 


4603 


6389 


784CIP2B_719


7764 


1032 


2818 


4604 


6390 


784CIP2B_720


7765 


1033 


2819 


4605 


6391


784CIP2B_721


7766 


1034 


2820 


4606 




784CIP2B_722


7767 


1035 


2821 


4607


6393 


*7*7"1 

/0*il^Xf^JU / 2 J 


7769 


1036 


2822 


4608 


6394 




7770 


1037 


2823 


4609 


6395 




7774 


1038 


2824


4610 


6396 


7RdPTP7R 77C 


/ / 75* 


1039 


2825


4611 


6397 


784CIP2B_727


/ /OJ. 


1040 


2826 


4612 


6398 


7R4PTD70 Tin 


7782 


1041 


2827 


4613 


6399 


TfldfTDTJn 700 


7783 


1042 


2828 


4614 


6400 


' O 4 Iw. J. r 4.D /JU 


7787 


1043 


2829 


4615


6401 




7792 


1044 


2830 


4616 


6402 


784CIP2B 732 


7795 


1045 


2831 


4617 


6403


784CIP2B 733 


7801 


1046 


2832 


4618 


6404 


784CIP2B_734


7807 


1047 


2833


4619


6405


784CIP2B_735


7808 


1048 


2834 


4620 


6406 


784CIP2B_736


7819 


1049 


2835 


4621 


6407 


784CIP2B_737 


7824 


1050 


2836


4622 


6408 


784CIP2B 738 


7826 


1051


2837 


4623 


6409 


784CIP2B 739 


7829


1052 




4624 


6410 


784CIP2B_740


7832 


1053 


2839 


4625 


6411 


784CIP2B 741 


7839 


1054 


2840 


4626 


6412 


784CIP2B 743 


7847 


1055 


2841 


4627 


6413 


784CIP2B 744 


7848


1056 


2842


4628


6414 


784CIP2B 745 


7853 


1057 


2843 


4629 


6415 


784CIP2B 746 


7854 


1058 


2844 


4630 


6416


784CIP2B_747


7856 


1059 


2845 


4631 


6417 


784CIP2B_748


7862 


1060 


2846 


4632


6418


784CIP2B_749


7865 


1061 


2847 


4633 


6419 


784CIP2B_750 


7874 


1062 


2848 


4634 


6420 


784CIP2B 751 


7877 


1063 


2849 


4635 


6421 


784CIP2B 752 


7880 


1064 


2850 


4636 


6422


784CIP2B 753 


7882 


1065 


2851 


4637 


6423 


784CIP2B_754


7884 


1066 


2852 


4638 


6424 


784CIP2B 755 


7886 


1067 


2853 


4639 


6425 


784CIP2B 756 


7888 


1068 


2854 


4640 


6426


784CIP2B 757 


7889 


1069 


2855 


4641 


6427 


784CIP2B 758 


7901 


1070 


2856 


4642 


6428 


784CIP2B 759 


7910 


1071 


2857 


4643 


6429 


784CIP2B 760 


7911 


1072 


2858 


4644 


6430 


784CIP2B_761


7921


1073 


2859 


4645 


6431 


784CIP2B 762 


7923 


1074 


2860 


4646 


6432 


784CIP2B 763 


7924 


1075 


2861 


4647 


6433 


784CIP2B 764 




1076 


2862 


4648 


6434 


784CIP2B 765 




1077 


2863 


4649 


6435 


784CIP2B_766


705Q 


1078


2864 


4650 


6436 


784CIP2B 767 


119 JU 


1079


2865 


4651 


6437 


784CIP2B 768 




1080 


2866 


4652 


6438 


784CIP2B_769


7938


1081 


2867


4653 


6439 


784CIP2B 770 


f J H A 


1082 


2868 


4654 


6440 


784CIP2B 771 


7Q4< ' 
f j 


1083 


2869 


4655 


6441 


784CIP2B 772 




1084 


2870


4656 


6442 


784CIP2B 773 


7948


1085 


2871 


4657 


6443 


784CIP2B_774


7951 


1086


2872 


4658 


6444 


784CIP2B 775 


7952 


1087


2873 


4659 


6445 


784CIP2B 776 


7953 


1088 


2874 


4660


6446 


784CIP2B 777 


7954 


1089 


2875 


4661 


6447 


784CIP2B 778 


7957 


1090 


2876 


4662 


6448 


784CIP2B_779


7958 


1091 


2877 


4663 


6449


784CIP2B 780 


7961 


1092 


2878 


4664 


6450 


784CIP2B 781 


7965 


1093 


2879 


4665 


6451 


784CIP2B_782


7966 


1094


2880 


4666 


6452 


784CIP2B 783 


7979 


1095 


2881 


4667 


6453


784CIP2B 784 


7986 


1096


2882 


4668 


6454 


784CIP2B 785 


7986 


1097 


2883 


4669 


6455 


784CIP2B 786 


7988 


1098 


2884 


4670 


6456 


784CIP2B 787 


7991 


1099 


2885 


4671 


6457 


784CIP2B_788


7992 


1100 


2886 


4672 


6458 


784CIP2B_789 


7992 


1101 


2887 


4673 


6459 


784CIP2B 790 


7992 


1102 


2888 


4674 


6460


784CIP2B_791


7992 


1103 


2889 


4675 


6461 


784CIP2B 792 


8003 


1104 


2890 


4676 


6462 


784CIP2B 793 


8014 


1105 


2891 


4677 


6463 


784CIP2B 794 


8015 


1106 


2892 


4678 


6464 


784CIP2B_795


8016 


1107 


2893 


4679 


6465


784CIP2B_796


8017 


1108 


2894


4680 


6466 


784CIP2B_797


8019 


1109 


2895 


4681 


6467 


784CIP2B 798 


8020 


1110 


2896


4682 


6468 


784CIP2B 799 


8022 


1111 


2897 


4683 


6469 


784CIP2B 800 


8022 


1112 


2898 


4684 


6470 


784CIP2B 801 


8028 


1113


2899 


4685 


6471 


784CIP2B 802 


6030


1114 


2900 


4686 


6472 


784CIP2B_803


8038 


1115 


2901 


4687 


6473 




8042 


1116 


2902 


4688 


6474 


*7Piir , TD'5H one 


8045 


1117 


2903 


4689 


6475 


TRZrTDOO Qt\C 
/04^XirZt3 bvt> 


8045 


1118 


2904 


4690 


6476 




8046 


1119 


2905 


4691 


6477


TftdPTDOR O n D 


8047 


1120 


2906 


4692 


6478 




8051 


1121 


2907 


4693 


6479 


HQ A nTDTD Q 1 n 
/o4LJ.rzo olU 


8059 


1122


2908


4694


6480


784CIP2B 811 


8064 


1123 


2909 


4695


6481 


784CIP2B 812 


8069 


1124 


2910 


4696 


6482


784CIP2B 813 


8074


1125


2911


4697 




784CIP2B 814 


8077 


1126 


2912 






784CIP2B_815 


8078 


1127 


2913 


4699


6485


784CIP2B 816 


8079 


1128


2914


4700 




784CIP2B_817


8084 


1129 


2915 


4701


0*0 / 


784CIP2B_818


8088


1130 


2916 


4702




784CIP2B 819 


8090 


1131 


2917 


4703 


6489


784CIP2B 820 


8091 


1132 


2918 




6490 


784CIP2B 821 


8099 


1133 


2919 


Anne. 
** 1 us 


6491 


784CIP2B 822 


8099


1134 


2920 


4706 




784CIP2B 823 


8100 


1135 


2921 


t f\J 1 


6493 


784CIP2B_824 


8102


1136 


2922 


a ir\Q 
t /uo 


6494 


784CIP2B_825 


8103 


1137 


2923 


4709 


6495 


784CIP2B 826 


8103 


1138


2924 


a *7i n 

1 I JLU 


6496 


784CIP2B 827 


8104 


1139 


2925


A T1 1 


6497 


784CIP2B 828 


8108 


1140 


2926 




6498 


784CIP2B 829 


8110 


1141 


2927 


All -a 


6499 


784CIP2B_830


8116 


1142 


2928 




eenn 


784CIP2B 831 


8117


1143 


2929 


4715


6501


784CIP2B 832 


8123 


1144


2930


4716


6502 


784CIP2B_833


8130


1145 


2931 


4717


6503 


784CIP2B_834


8130 


1146 


2932 


4718 


trend 


784CIP2B_835 


8143


1147 


2933 


4719 


6505


784CIP2B_836


8143 


1148 


2934 


4720 




784CIP2B_837 


8154 


1149 


2935 


4721


6507 


784CIP2B_838 


8155 


1150 


2936 


4722


6508


784CIP2B 839 


8162 


1151


2937 


1 / £. J 


6509 


784CIP2B 840 


81^3 


1152 


2938 


4724




784CIP2B_841


8172 


1153


2939 




' CC1 1 " 


784CIP2B_842


8173 


1154 


2940 


4726




784CIP2B_843


8119 


1155 


2941 


4727


fCi ■a" 


784CIP2B 844 


8182


1156


2942 


4728 


6514


784CIP2B 845 


8183 


1157


2943 


4729 


6515


784CIP2B 846 


8184 


1158 


2944 


4730


6516


784CIP2B_847


8185 


1159


2945 


4731


6517 


784CIP2B 848 


8187 


1160 


2946 


4732 


6518 


784CIP2B 849 


8188 


1161 


2947


4733 


6519 


784CIP2B 850 


8190 


1162 


2948


4734


6520 


784CIP2B 851 


8190 


1163 


2949 


4735 


6521


784CIP2B_852


8192 


1164 


2950 


4736


6522 


784CIP2B 853 


8193 


1165 


2951 


4737 


6523 


784CIP2B 854 


8197 


1166 


2952 


4738 




784CIP2B_855 


8197 


1167 


2953 


4739 




784CIP2B 856 


8199 


1168 


2954 


4740 


6526 


784CIP2B 857 


8202 


1169 


2955 


4741 


6527 


784CIP2B_858 


8203 


1170


2956 


4742


6528 


784CIP2B_859


8208 


1171 


2957 


4743 


6529


784CIP2B_860


8209 


1172 


2958 


4744 


6530 


784CIP2B_861


8211 


1173 


2959 


4745


6531 


784CIP2B_862


8214 


1174 


2960 


4746 


6532 


784CIP2B 863 


8217 


1175


2961 


4747 


6533 


784CIP2B_864


8223


1176


2962


4748 


6534 


784CIP2B_865


8224 


1177 


2963


4749 


6535 


784CIP2B_866 


8226


1178


2964 


4750 


6536


784CIP2B 867 


8227 




2965


4751 


6537 


784CIP2B_868 


8229 


1180 


2966 


4752 


6538 


784CIP2B_869 


8232 


1181 


2967 


4753 


6539 


784CIP2B_870


8236 


1182 


2968 


4754 


6540 


784CIP2B 871 


8239 


1183


2969


4755 


6541 


784CIP2B 872 


8244 


1184 


2970 


4756 


6542 


784CIP2B 873 


8245 


1185


2971 


4757 


6543 


784CIP2B_874 


8248 


1186


2972 


4758 


6544 


784CIP2B 875 


8251 


1187


2973 


4759 


6545 


784CIP2B 876 


8253 


1188


2974 


4760 


6546 


784CIP2B_877


8260 


1189


2975 


4761 


6547 


784CIP2B 878 


8262 


1190


2976 


4762 


6548 


784CIP2B 879 


8268 


1191 


2977 


4763 


6549 


784CIP2B 880 


8270 


1192 


2978 


4764 


6550 


784CIP2B_881


8272 


1193 


2979 


4765 


6551 


784CIP2B_882


8274 


1194 


2980 


4766 


6552 


784CIP2B 883 


8274 


1195 


2981 


4767 


6553 


784CIP2B 884 


8275 


1196 


2982 


4768 


6554 


784CIP2B 885 


8277 


1197 


2983 


4769 


6555 


784CIP2B_886 


8281 


1198 


2984 


4770 


6556 


784CIP2B_887


8283 


1199 


2985 


4771 


6557 


784CIP2B_888


8289 


1200 


2986 


4772 


6558 


784CIP2B_889 


8295 


1201 


2987 


4773 


6559 


784CIP2B_890


8300


1202 


2988 


4774 


6560 


784CIP2B 891 


8303 


1203 


2989 


4775 


6561 


784CIP2B 892 


8304 


1204 


2990 


4776 


6562 


784CIP2B_893


8305 


1205 


2991 


4777 


6563 


784CIP2B 894 


8309 


1206 


2992 


4778


6564 


784CIP2B 895 


8318 


1207 


2993 


4779 


6565 


784CIP2B 896 


8319 


1208 


2994 


4780 


6566 


784CIP2B_897 


8321 


1209 


2995 


4781 


6567 


784CIP2B_898 


8322


1210 


2996 


4782 


6568 


784CIP2B_899 


8323 


1211 


2997 


4783 


6569 


784CIP2B 900 


8325 


1212 


2998 


4784 


6570 


784CIP2B 901 


8331 


1213 


2999 


4785 


6571 


784CIP2B_902 


8332 


1214 


3000 


4786 


6572 


784CIP2B_903 


8333 


1215 


3001 


4787 


6573 


784CIP2B_904 


8335 


1216


3002 


4788 


6574 


784CIP2B 905 


8336 


1217


3003 


4789 


6575 


784CIP2B_906 


8337 


1218


3004


4790 


6576 


784CIP2B_907 


8340 


1219 


3005


4791 


6577 


784CIP2B 908 


8343 


1220 


3006


4792 


6578 


784CIP2B_909 


8347 


1221


3007 


4793 


6579 


784CIP2B 910 


8349 


1222 


3008


4794 


6580 


784CIP2B 911 


8351 




3009


4795 


6581 


784CIP2B 912 


8353 


1224 


3010


4796


6582 


784CIP2B 913 


8355


1225


3011 


4797 


6583 


784CIP2B 914 


8361 


1226 


3012


4798 


6584 


784CIP2B_915 


8365 


1227


3013 


4799 


6585 


784CIP2B_916 


8367 




3014


4800 


6586 


784CIP2B_917 


8369


1229


3015 


4801 


6587 


784CIP2B_919 


8375 


1230 


3016 


4802 


6588


784CIP2B_920


8387 


1231 


3017 


4803 


6589


784CIP2B_921 


8391 


1232 


3018 


4804 


6590 


784CIP2B 922 


8393 


1233 


3019 


4805 


6591 


784CIP2B_923 


8393 


1234 


3020 


4806 


6592 


784CIP2B 924 


8394


1235 


3021 


4807 


6593 


784CIP2B_925


8395


1236 


3022 


4808 


6594 


784CIP2B_926


8396


1237 


3023


4809 


6595 


784CIP2B 927 


8398


1238 


3024 


4810 


6596 


784CIP2B_928




8402 


1239 


3025 


4811 


6597 


784CIP2B_929




1240 


" 3026 


4812 


6598 


784CIP2B_930


O'iUJ 


1241 


3027 


4813 


6599 


784CIP2B_931


a/ nc 
o4 Uo 


1242 


3028 


4814 


6600 


784CIP2B_932


04 09 


1243 


3029 


4815 


6601 


784CIP2B_933


8410


1244 


3030 


4816 


6602 


784CIP2B_934


ft A 1 A 


1245 


3031 


4817 


6603


784CIP2B_935


8415 


1246 


3032 


4818 


6604 


784CIP2B_936


8419 


1247 


3033


4819 


6605 


784CIP2B_937


8426 


1248 


3034 


4820 


6606




8430 


1249 


3035 


4821 


6607 


784CIP2B_939


8431 


1250 


3036 


4822 


6608




8432


1251 


3037 


4823 


6609




8433 


1252 


3038 


4824 


6610 


784CIP2B_942


8434 


1253 


3039 


4825


6611 


7fldrTD5tJ OA1 


8438


1254 


3040 


4826 


6612 




8439


1255 


3041 


4827 


6613 




8441 


1256 


3042 


4828 


6614


784CIP2B_946


8450 


1257


3043 


4829 


6615 


784CIP2B_947


8451 


1258 


3044 


4830 


6616 


784CIP2B_948


8452


1259 


3045 


4831


6617


784CIP2B_949


8460 


1260


3046 


4832 


6618


784CIP2B 950 


8461 


1261 


3047 


4833 


6619


784CIP2B_951


8462 


1262


3048 


4834 


t>o«u 


784CIP2B 952 


8464


1263 


3049 


4835 


OD£l 




8465 


1264 


3050 


4836 


6622 




8467 


1265 


3051 


4837 


6623 


784CIP2B_955


8470 


1266 


3052 


4838 


6624 


784CIP2B_956


8471 


1267 


3053 


4839 


6625 


'Ollir/D 9«> / 


8473 


1268


3054 


4840 


6626 




8474 


1269 


3055 


4841 


6627 


/041-XJb'^n 9b9 


8475 


1270 


3056


4842 


6628


7flAPTD"3H Of n 


8476 


1271 


3057 


4843 


6629 


*7Q^r l TD'5Ci Q £ 1 


8480 


1272 


3058 


4844 


6630 


784CIP2B_962


8482 


1273 


3059 


4845 


6631 




8482


1274 


3060 


4846 


6632 


•7ft Arrtaoti "ac^/i 


8486 


1275


3061


4847 


6633 


784CIP2B_965


8488 


1276 


3062 


4848 


6634 


~t Q A fT TSOQ Q«f£ 


8492 


1277 


3063 


4849 


6635 


784CIP2B_967


8494 


1278 


3064 


4850 


6636


/ O ** V_ J. f rf£id ?bo 


o49b 


1279 


3065


4851


6637 




8497


1280


3066 


4852 


6638 




8499 


1281 


3067 


4853 


6639 


784CIP2B_971


8513 


1282 


3068 


4854 


6640 


784CIP2B_972


" QCOO 


1283 


3069 


4855


6641 


784CIP2B_973


a c5C 
a 3<so 


1284 


3070 


4856 


6642 


784CIP2B_974




1285 


3071 


4857 


6643 




8533 


1286 


3072 


4858 


6644 






1287 


3073 


4859 


6645 


784CIP2B_977


ob44 


1288 


3074 


4860 


6646 




8565 


1289 


3075 


4861 


6647 




8565 


1290 


3076 


4862 


6648 


'o4Llr^o 9oU 


8572 


1291 


3077


4863


6649 


/O^Llt'^o 901 


8576 


1292 


3078 


4864


6650 


784CIP2B_982 


8578 


1293 


3079 


4865 


6651 


784CIP2B 983 


8584 


1294 


3080 


4866 


6652 


784CIP2B 984 


8*9$ " 


1295


3081 


4867


6653 


784CIP2B_985


8602 


1296 


3082 


4868 


6654 


784CIP2B_986


8604 


1297 


3083 


4869 


6655 


784CIP2B_987


8609 


1298 


3084 


4870 


6656 


784CIP2B 988 


8612 


1299 


3085


4871


6657 


784CIP2B 989 


8*37


1300 


3086 


4872


6658 




8640


1301


3087 


4873 


6659 


784CIP2B_991


8643 


1302 


3088 


4874 


6660 




8645 


1303 


3089 


4875 


6661 




8650 


1304 


3090 


4876 


6662 


'OftUXr^D 334 


8651 


1305 


3091 


4877


6663 


TflAPTDIU QQC 


8654 


1306 


3092 


4878 


6664 




8655 


1307


3093 


4879 


6665 




8657 


1308 


3094 


4880 


6666 




8665 


1309 


3095 


4881 


6667 


*7 RiPTDTn ci 0 a 
ZO*il_JL*'£D 333 


8668 


1310 


3096 


4882 


6668 


784CIP2B_1000


8671 


1311 


" 3097 


4883 


6669


/ofH-J-f^tS JLUUi. 


8672 


1312 


3098 


4884


6670 


784CIP2B_1002


8692 


1313 


3099 


4885


"" 6671 


784CIP2B_1003


8706 


1314 


3100 


4886


""(TfCn-? 

OO / £ 


784CIP2B_1004


8716 


1315 


3101 


4887





784CIP2B_1005


8719 


1316 


3102 


4888


0 O f*t 


784CIP2B 1006 


8743 


1317 


3103 


4889 


6675


784CIP2B 1007 


8764 


1318 


3104 


4890 


acne. 

DO / O 


784CIP2B 1008 


8764


1319 


3105 


4891


&<i n 

O O / / 


784CIP2B_1009


8764 


1320 


3106 


4892 


6678 


784CIP2B_1010 


8774 


1321 


3107 


4893


6679 


784CIP2B 1011 


8782 


1322 


3108 


4894 


6680 


784CIP2B 1012 


8796 


1323 


3109 




6681 


784CIP2B 1013 


8827 


1324 


3110 


4896 


6682 


784CIP2B 1014 


8842 


1325 


3111 


4897 


6683 


784CIP2B 1015 


8842 


1326 


3112 


4898 


6684 


784CIP2B_1016 


8858 


1327 


3113 


4899 


6685 


784CIP2B_1017 


8871 


1328 


3114 


4900 


6686 


784CIP2B 1018 


8921 


1329 


3115 


4901 


6687 


784CIP2B 1019 


8927 


1330 


3116 


4902 


6688 


784CIP2B 1020 


8942 


1331 


3117 


4903 


6689 


784CIP2B_1021 


8994 


1332 


3118 


4904 


6690 


784CIP2B 1022 


9023 


1333 


3119 


4905 


6691 


784CIP2B 1023 


9028 


1334 


3120 


4906 


6692 


784CIP2B_1024 


9058 


1335 


3121 


4907 


6693 


784CIP2B 1025 


9058 


1336 


3122 


4908 


6694 


784CIP2B 1026 


9079 


1337 


3123 


4909 


6695 


784CIP2B_1027 


9079 


1338 


3124 


4910 


6696 


784CIP2B 1028 


9082 


1339 


3125 


4911 


6697 


784CIP2B_1029 


9084 


1340 


3126 


4912 


6698 


784CIP2B_1030 


9093 


1341 


3127 


4913 


6699 


784CIP2B_1031 


9101 


1342 


3128 


4914 


6700 


784CIP2B_1032 


9103 


1343 


3129 


4915 


6701 


784CIP2B_1033 


9105 


1344 


3130 


4916 


6702 


784CIP2B_1034 


9151 


1345 


3131 


4917 


6703 


784CIP2B_1035 


9161 


1346 


3132 


4918 


6704 


784CIP2B_1036 


9172 


1347 


3133 


4919 


6705 


784CIP2B_1037 


9174 


1348 


3134 


4920 


6706 


784CIP2B_1038 


9204 


1349 


3135 


4921 


6707 


784CIP2B_1039 


9234 


1350 


3136 


4922 


6708 


784CIP2B_1040 


9235 


1351 


3137 


4923 


6709 


784CIP2B_1041 


9239 


1352 


3138 


4924 


6710 


784CIP2B 1042 


9256 


1353 


3139 


4925 


6711 


784CIP2B 1043 


9276 


1354 


3140 


4926 


6712 


784CIP2B 1044 


9345 


1355 


3141 


4927 


6713 


784CIP2B 1045 


9379 


1356 


3142 


4928 


6714 


784CIP2B 1046 


9435 


1357 


3143 


4929 


6715 


784CIP2B 1047 


9437 


1358 


3144 


4930 


6716 


784CIP2B 1048 


9469 


" 1359 


3145 


4931 


6717 


784CIP2B 1049 


9500 


1360 


" 3146 


4932 


6718 


784CIP2B 1050 


9502 


1361 


3147 


4933 


6719 


784CIP2B_1051 


9520 
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of full- 
length 
nucleotide 
sequence 


SEQ ID 
NO: of 
full- 
length 
peptide 
sequence 


SEQ ID NO: 
of contig 
nucleotide 
sequence 


SEQ ID 
NO: 
of contig 
peptide 
sequence 


Priority 
docket number_ 
corresponding 
SEQ ID NO: in 
priority 
application 


SEQ ID 
NO: in 
U.S. S.N. 
09/488,725 


1362 


3148 


4934 


6720 


784CIP2B 1052 


5*D^l X 


1363 


3149 


4935 


6721 


784CIP2B 1053 


73lJ. 


1364 


3150 


4936 


6722 


784CIP2B_1054 


oca a 


1365 


3151 


4937 


6723 


784CIP2B 1055 


9556 


1366 


3152 


4938 


6724 


784CIP2B 1056 


9556 


1367 


3153 


4939 


6725 


784CIP2B 1057 


9""J7H 

73/3 


1368 


3154 


4940 


6726 


784CIP2B 1058 


?30? 


1369 


3155 


4941 


6727 


784CIP2B 1059 


9599 


1370 


3156 


4942 


6728 


784CIP2B 1060 


0SO2 ' 


1371 


3157 


4943 


6729 


784CIP2B 1061 


7D VO 


1372 


3158 


4944 


6730 


784CIP2B 1062 




1373 


3159 


4945 


6731 


784CIP2B_1063 




1374 


3160 


4946 


" 6 7" 3 2 


784CIP2B_1064 


9646 


1375 


3161 


4947 


6733 


784CIP2B_1065 


9747 


1376 


3162 


4948 


6734 


784CIP2B_1066 


9773 


1377 


3163 


4949 


6735 




9785 


1378 


3164 


4950 


6736 


784CIP2B_1068 


9801 


1379 


3165 


4951 


6737 


784CIP2B_1069 


9811 


1380 


3166 


4952 


6738 




3*043 


1381 


3167 


4953 


6739 


784CIP2B_1071 


9654 


1382 


3168 


4954 


6740 




9854 


1383 


3169 


4955 


6741 


784CIP2B_1073 


9864 


1384 


3170 


4956 


6742 




9864 


1385 


3171 


4957 


6743 


784CIP2B_1075 


9871 


1386 


3172 


4958 


6744 


784CIP2B_1076 


9879 


1387 


3173 


4959 


6745 


784CIP2B_1077 


9881 


1388 


3174 


4960 


6746 


784CIP2B_1078 


9885 


1389 


3175 


4961 


6747 




9901 


1390 


3176 


4962 


6748 


784CIP2B_1080 


9912 


1391 


3177 


4963 


6749 


784CIP2B_1081 


9916 


1392 


3178 


4964 


6750 


784CIP2B_1082 


9921 


1393 


3179 


4965 


6751 


784CIP2B_1083 


9925 


1394 


3180 


4966 


6752 


784CIP2B_1084 


9930 


1395 


3181 


4967 


6753 


784CIP2B_1085 


9949 


1396 


3182 


4968 


6754 


784CIP2B_1086 


9951 


1397 


3183 


4969 


6755 




9959 


1398 


3184 


4970 


6756 


784CIP2B_1088 


9973 


1399 


3185 


4971 


6757 


784CIP2B_1089 


9982 


1400 


3186 


4972 


6758 




9994 


1401 


3187 


4973 


6759 


784CIP2B_1091 


10021 


1402 


3188 


4974 


6760 




10041 


1403 


3189 


4975 


6761 




10067 


1404 


3190 


4976 


6762 


784CIP2B 1095 


10073 


1405 


3191 


4977 


6763 


784CIP2B_1096 


10112 


1406 


3192 


4978 


6764 


784CIP2B_1097 


10117 


1407 


3193 


4979 


6765 




10132 


1408 


3194 


4980 


6766 


784CIP2B_1099 


X\i JLb 3» 


1409 


3195 


4981 


6767 




XvZX I 


1410 


3196 


4982 


6768 


784CIP2B_1101 


1 (177 C 


1411 


3197 


4983 


6769 


784CIP2B 1102 


J.U Z J £ 


1412 


3198 


4984 


6770 


784CIP2B_1103 


" 1 0717 


1413 


3199 


4985 


6771 


784CIP2B 1104 


1 fl*? "7 Q 


1414 


3200 


4986 


6772 


784CIP2C_1 


o3 


1415 


3201 


4987 


6773 




271 


1416 


3202 


4988 


6774 


784CIP2C 3 


848 


1417 


3203 


4989 


6775 


784CIP2C 4 


849 


1418 


3204 


4990 


6776 


784CIP2C 5 


864 


1419 


3205 


4991 


6777 


784CIP2C_6 


953 


1420 


3206 


4992 


6778 


784CIP2C_7 


980 


1421 


3207 


4993 


6779 


784CIP2C_8 


1595 


1422 


3208 


4994 


6780 


784CIP2C_9 


1697 


1423 


3209 


4995 


6781 


784CIP2C_10 


1744 
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SEO ID wn- 
of full- 
length 
nucleotide 
sequence 


SEQ ID 
NO: of 
full- 
length 
peptide 
sequence 


SEQ ID NO: 
of contig 
nucleotide 
sequence 


SEQ ID 
NO: 
of contig 
peptide 
sequence 


Priority 
docket number_ 
corresponding 
SEQ ID NO: in 
priority 
application 


SEQ ID 
NO: in 
U.S. S.N. 
09/488,725 


1424 


" " 3 210 - 


4996 


6782 


784CIP2C 11 


1937 


1425 


3211 


4997 


" 6783 


784CIP2C 12 


1955 


1426 


3212 


4998 


6784 


784CIP2C 13 


1955 


1427 


3213 


4999 


6785 


784CIP2C 14 


2185 


1428 


3214 


5000 


6786 


784CIP2C 15 


2889 


1429 


3215 


5001 


6787 


784CIP2C 16 


2901 


1430 


3216 


5002 


6788 


784CIP2C 17 


2902 


1431 


3217 


5003 


6789 


784CIP2C 18 


2 9b" 5 


1432 


3218 


5004 


6790 


784CIP2C 19 


2948 


1433 


3219 


5005 


6791 


784CIP2C 20 


2956 


1434 


3220 


5006 


" 6792 


784CIP2C 21 


2959 


1435 


3221 


5007 


6793 


784CIP2C_22 


oogc 


1436 


3222 


5008 


6794 


784CIP2C 23 


*70D 


1437 


3223 


5009 


6795 


784CIP2C_24 


^.J /V 


1438 


3224 


5010 


6796 


784CIP2C_25 


7QOC 


1439 


3225 


5011 


6797 






1440 


3226 


5012 


6798 


784CIP2C_27 




1441 


3227 


5013 


6799 




2993 


1442 


3228 


5014 


6800 


784CIP2C_29 


JUJL f 


1443 


3229 


5015 


6801 




J046 


1444 


3230 


5016 


6802 






1445 


3231 


5017 


6803 




3357 


1446 


3232 


5018 


6804 


784C , TP7P* n 




1447 


3233 


5019 


6805 


7R4PTTJOr» 


3432 


1448 


3234 


5020 


6806 


/ o v_ i. Jr ^ V- Jo 


3438 


1449 


3235 


5021 


6807 


'O'i^IrcL Jo 


3439 


1450 


3236 


5022 


6808 


784CIP2C_39 


3463 


1451 


3237 


5023 


6809 




3466 


1452 


3238 


5024 


6810 


784CIP2C_41 


3466 


1453 


3239 


5025 


6811 


784CIP2C_42 


3467 


1454 


3240 


5026 


6812 


784CIP2C_43 


3468 


1455 


3241 


5027 


6813 




3483 


1456 


3242 


5028 


6814 


784CIP2C_45 


3484 


1457 


3243 


5029 


6815 


784r , TP5r 1 ' 


3488 


1458 


3244 


5030 


6816 


784CIP2C_47 


3491 


1459 


3245 


5031 


6817 


784CIP2C_48 


3493 


1460 


3246 


5032 


6818 


784CIP2C_49 




1461 


3247 


5033 








1462 


3248 


5034 


6820 


7ft4r , TPDr» 




1463 


3249 


5035 


6821 


784CIP2C_52 




1464 


3250 


5036 


6822 


784CIP2C_53 


3503 


1465 


3251 


5037 


6823 


784CIP2C_54 


•> DU 1 


1466 


3252 


5038 


6824 


784CIP2C 55 




1467 


3253 


5039 


6825 


784CIP2C_56 




1468 


3254 


5040 


6826 


784CIP2C 57 




1469 


3255 


5041 


6827 


784CIP2C 58 


m4«T" 


1470 


3256 


5042 


6828 


784CIP2C 59 


1 1^4 n 


1471 


3257 


5043 


6829 


784CIP2C 60 




1472 


3258 


5044 


6830 


784CIP2C 61 


3553 


1473 


3259 


5045 


6831 


784CIP2C 62 


3564 


1474 


3260 


5046 


6832 


784CIP2C 63 


3567 


1475 


3261 


5047 


6833 


784CIP2C 64 




1476 


3262 


5048 


6834 


784CIP2C 65 


j3/j 


1477 


3263 


5049 


6835 


784CIP2C_66 


J9/4 


1478 


3264 


5050 


6836 


784CIP2C 67 


3583 


1479 


3265 


5051 


6837 


784CIP2C 68 


3615 


1480 


3266 


5052 


6838 


784CIP2C 69 


3623 


1481 


3267 


5053 


6839 


784CIP2C_70 


3629 


1482 


3268 


5054 


6840 


784CIP2C 71 


3666 


1483 


3269 


5055 


6841 


784CIP2C_72 


3667 


1484 


3270 


5056 


6842 


784CIP2C 73 


3906 


1485 


3271 


" 5057 


6843 


784CIP2C_74 


" 3912 
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SEQ ID NO: 


of full- 
length 
nucleotide 
sequence 


SEQ ID 
NO: of 
full- 
length 
peptide 
sequence 


SEQ ID NO: 
of contig 
nucleotide 
sequence 


SEQ ID 
NO: 
of contig 
peptide 
sequence 


Priority 
docket number_ 
corresponding 
SEQ ID NO: in 
priority 
application 


SEQ ID 
NO: in 
U.S. S.N. 
09/488,725 






1486 


3272 


5058 


6844 


784CIP2C_75 


3924 


1487 


3273 


5059 


6845 


784CIP2C 76 


3928 


1488 


3274 


5060 


6846 


784CIP2C 77 


3935 




3275 


5061 


6847 


784CIP2C_78 


3959 




3276 


5062 


6848 


784CIP2C 79 


3981 


1491 


3277 


5063 


6849 


784CIP2C_80 


3989 




3278 


5064 


6850 


784CIP2C_81 


4295 


1493 


3279 


5065 


6851 


784CIP2C_82 


4300 


1494 


3280 


5066 


6852 


784CIP2C_83 


4360 


1495 


3281 


5067 


6853 


784CIP2C_84 


4362 


1496 


3282 


5068 


6854 


784CIP2C_85 


4371 


1497 


3283 


5069 


6855 


784CIP2C_86 


4373 


1498 


3284 


5070 


6856 


784CIP2C_87 


4376 


1499 


3285 


5071 


6857 


784CIP2C_89 


4378 


1500 


3286 


5072 


6858 


784CIP2C_90 


4382 


1501 


3287 


5073 


6859 


784CIP2C_91 


4409 


1502 


3288 


5074 


6860 


784CIP2C_92 


4421 


1503 


3289 


5075 


6861 


784CIP2C 93 


4421 


1504 


3290 


5076 


6862 


784CIP2C_94 


4426 


1505 


3291 


5077 


6863 


784CIP2C_95 


4430 


1506 


3292 


5078 


6864 


784CIP2C_96 


4435 


1507 


3293 


5079 


6865 


784CIP2C_97 


4436 


1508 


3294 


5080 


6866 


784CIP2C 98 


4439 


1509 


3295 


5081 


6867 


784CIP2C 99 


4440 


1510 


3296 


5082 


6868 


784CIP2C_100 


4441 


1511 


3297 


5083 


6869 


784CIP2C 101 


4442 


1512 


3298 


5084 


6870 


784CIP2C_102 


4455 


1513 


3299 


5085 


6871 


784CIP2C_103 


4462 


1514 


3300 


5086 


6872 


784CIP2C_104 


4466 


1515 


3301 


5087 


6873 


784CIP2C 105 


4469 


1516 


3302 


5088 


6874 


784CIP2C 106 


4471 


1517 


3303 


5089 


6875 


784CIP2C 107 


4481 


1518 


3304 


5090 


6876 


784CIP2C_108 


4483 


1519 


3305 


5091 


6877 


784CIP2C_109 


4484 


1520 


3306 


5092 


6878 


784CIP2C 110 


4486 


1521 


3307 


5093 


6879 


784CIP2C_111 


4490 


1522 


3308 


5094 


6880 


784CIP2C_112 


4499 


1523 


3309 


5095 


6881 


784CIP2C 113 


4503 


1524 


3310 


5096 


6882 


784CIP2C_114 


4506 


1525 


3311 


5097 


6883 


784CIP2C 115 


4509 


1526 


3312 


5098 


6884 


784CIP2C_116 


4514 


1527 


3313 


5099 


6885 


784CIP2C_117 


4516 


1528 


3314 


5100 


6886 


784CIP2C 118 


4522 


1529 


3315 


5101 


6887 


784CIP2C_119 


4525 


1530 


3316 


5102 


6888 


784CIP2C_120 


4527 




3317 


5103 


6889 


784CIP2C_121 


4528 


1532 


3318 


5104 


6890 


784CIP2C 122 


4529 


1533 


3319 


5105 


6891 


784CIP2C_123 


4532 


1534 




5106 


6892 


784CIP2C 124 


4537 




3321 


5107 


6893 


784CIP2C_125 


4538 


1536 


3322 


5108 


6894 


784CIP2C_126 


4551 


1537 


3323 


5109 


6895 


784CIP2C_127 


4552 


1538 


3324 


5110 


6896 


784CIP2C 128 


4559 


1539 


3325 


5111 


6897 


784CIP2C 129 


4567 


1540 


3326 


5112 


6898 


784CIP2C 130 


4568 


1541 


3327 


5113 


6899 


784CIP2C_132 


4585 


1542 


" 3328 


5114 


6900 


784CIP2C_133 


4592 


1543 


3329 


5115 


6901 


784CIP2C 134 


4609 


1544 


3330 


5116 


6902 


784CIP2C 135 


4616 


1545 


3331 


5117 


6903 


784CIP2C 136 


4617 


1546 


3332 


5118 


6904 


784CIP2C 137 


4618 


1547 


3333 


5119 


6905 


784CIP2C_138 


4620 
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SEQ ID NO: 
of full- 
length 
nucleotide 
sequence 


SEQ ID 
NO: of 
full- 
length 
peptide 
sequence 


SEQ ID NO: 
of contig 
nucleotide 
sequence 


SEQ ID 

NO: 

of contig 

peptide 

sequence 


Priority 
docket number_ 
corresponding 
SEQ ID NO: in 
priority 
application 


SEQ ID 
NO: in 
U.S. S.N. 
09/488,725 


1548 


3334 


5120 


6906 


784CIP2C 139 


4624 


1549 




5121 


6907 


784CIP2C 140 


4632 


1550 


3336 


5122 


6908 


784CIP2C_141 


4634 


1551 


3337 


5123 


6909 


784CIP2C_142 


4638 


1552 


3338 


5124 


6910 


784CIP2C 143 


4639 


1553 


3339 


5125 


6911 


784CIP2C 144 


4643 


1554 




5126 


6912 


784CIP2C 145 


4644 


1555 




5127 


6913 


784CIP2C 146 


4655 


1556 


3342 


5128 


6914 


784CIP2C_147 


4668 


1557 


3343 


5129 


6915 


784CIP2C 148 


4.377 


1558 


3344 


5130 


6916 


784CIP2C_149 


4677 




"it AC 


5131 


6917 


784CIP2C_150 


4677 


1560 




5132 


6918 


784CIP2C_152 


4682 


1561 


3347 


5133 


6919 


784CIP2C 153 


4690 


1562 


3348 


5134 


6920 


784CIP2C_154 


4691 


1563 


3349 


5135 


6921 


784CIP2C 155 


4727 


1564 


3350 




5136 


6922 


784CIP2C 156 


4730 




3351 


5137 


6923 


784CIP2C_157 


4734 


1566 


3352 


5138 


6924 


784CIP2C 158 


4757 


1567 


3353 


5139 


6925 


784CIP2C 159 


4764 


1568 


3354 


5140 


6926 


784CIP2C_160 


4786 


1569 


3355 


5141 


6927 


784CIP2C 161 


4793 


1570 


3356 


5142 


6928 


784CIP2C 162 


4825 


1571 


3357 


5143 


6929 


784CIP2C 163 


4826 


1572 


3358 


5144 


6930 


784CIP2C_164 


4850 


1573 


3359 


5145 


6931 


784CIP2C_165 


4853 


1574 


3360 


5146 


6932 


784CIP2C 166 


4855 


1575 


3361 


5147 


6933 


784CIP2C 167 


4856 


1576 


3362 


5148 


6934 


784CIP2C_168 


4867 


1577 


3363 


5149 


6935 


784CIP2C_169 


4869 


1578 


3364 


5150 


6936 


784CIP2C_170 


4878 


1579 


3365 


5151 


6937 


784CIP2C_171 


4880 


1580 


3366 


5152 


6938 


784CIP2C_172 


4942 


1581 


3367 


5153 


6939 


784CIP2C 173 


4945 


1582 


3368 


5154 


6940 


784CIP2C_174 


4950 


1583 


3369 


5155 


6941 


784CIP2C 175 


4952 


1584 


3370 


5156 


6942 


784CIP2C 176 


4954 


1585 


3371 


5157 


6943 


784CIP2C 177 


4958 


1586 


3372 


5158 


6944 


784CIP2C 178 


4961 


1587 


3373 


5159 


6945 


784CIP2C_179 


5590 


1588 


3374 


5160 


6946 


784CIP2C 180 


5599 


1589 


3375 


5161 


6947 


784CIP2C 181 


5692 


1590 


3376 


5162 


6948 


784CIP2C 182 


5732 


1591 


3377 


5163 


6949 


784CIP2C 183 


5765 


1592 


3378 


5164 


6950 


784CIP2C 184 


5771 


1593 


3379 


5165 


6951 


784CIP2C 185 


5774 


1594 


3380 


5166 


6952 


784CIP2C 186 


5793 


1595 


3381 


5167 


6953 


784CIP2C_187 


5806 


1596 


3382 


5168 


6954 


784CIP2C 188 


5852 


1597 


3383 


5169 


6955 


784CIP2C_189 


5892 


1598 


3384 


5170 


6956 


784CIP2C 190 


6057 


1599 


3385 


5171 


6957 


784CIP2C_191 


6061 


1600 


3386 


5172 


6958 


784CIP2C_192 


6109 


1601 


3387 


5173 


6959 


784CIP2C 193 


6160 


1602 


3388 


5174 


6960 




6297 


1603 


3389 


5175 


6961 


784CIP2C 195 


6398 


1604 


3390 


5176 


6962 


784CIP2C 196 


6398 


1605 


3391 


5177 


6963 


784CIP2C 197 


6415 


1606 


3392 


5178 


6964 


784CIP2C_198 


6448 


1607 


3393 


5179 


6965 


784CIP2C 199 


6469 


1608 


3394 


5180 


6966 


784CIP2C 200 


6476 


1609 


3395 


5181 


6967 


784CIP2C 201 


6561 






WO 01/53312 



PCT/US00/34263 



SEQ ID NO: 
of full- 
length 
nucleotide 
sequence 


SEQ ID 
NO: of 
full- 
length 
peptide 
sequence 


SEQ ID NO: 
of contig 
nucleotide 
sequence 


SEQ ID 
NO: 

of contig 

peptide 

sequence 


Priority 
docket number_ 
corresponding 
SEQ ID NO: in 
priority 
application 


SEQ ID 
NO: in 
U.S. S.N. 
09/488,725 


1610 


3396 




6968 


784CIP2C_202 


6574 


1611 


3397 


5183 


6969 


784CIP2C 203 


6578 


" " 1612 


3398 


5184 


6970 


784CIP2C 204 


6662 


1613 




5185 


6971 


784CIP2C_205 


6672 




3400 


5186 


6972 


784CIP2C_206 


6691 


1615 


3401 


5187 


6973 


784CIP2C 207 


6695 






5188 


6974 


784CIP2C 208 


6746 


1617 




5189 


6975 


784CIP2C 209 


6898 




3404 


5190 


6976 


784CIP2C_210 


6938 


1619 


3405 


5191 


6977 


784CIP2C 211 


6943 


1620 


3406 


5192 


6978 


784CIP2C_212 


7110 


1621 


3407 


5193 


6979 


784CIP2C_213 


7200 


1622 


3408 


5194 


6980 


784CIP2C_214 


7212 


1623 


3409 


5195 


6981 


784CIP2C 215 


7218 




3410 


5196 


6982 


784CIP2C 216 


7249 




3411 


5197 


6983 


784CIP2C_217 


7500 




3412 


5198 


6984 


784CIP2C 218 


7509 


1627 


3413 


5199 


6985 


784CIP2C 219 


7523 




lAtA 

3414 


5200 


6986 


784CIP2C 220 


7544 




3415 


5201 


6987 


784CIP2C 221 


7564 


1630 


3416 


5202 


6988 


784CIP2C_222 


7568 


1631 


3417 


5203 


6989 


784CIP2C_223 


7631 


1632 


3418 


5204 


6990 


784CIP2C_224 


7813 


1633 


3419 


5205 


6991 


784CIP2C 225 


7831 


1634 


3420 


5206 


6992 


784CIP2C 226 


7843 


1635 


3421 


5207 


6993 


784CIP2C_227 


7907 


1636 


3422 


5208 


6994 


784CIP2C_228 


7943 


1637 


3423 


5209 


6995 


784CIP2C 229 


8175 


1638 


3424 


5210 


6996 


784CIP2C 230 


8216 




3425 


5211 


6997 


784CIP2C_231 


8225 


1640 


3426 


5212 


6998 


784CIP2C_232 


8271 


1641 


3427 


5213 


6999 


784CIP2C_233 


8397 


1642 


3428 


5214 


7000 


784CIP2C 234 


8466 


1643 


3429 


5215 


7001 


784CIP2C 235 


8503 


1644 


3430 


5216 


7002 


784CIP2C_236 


8953 


1645 


3431 


5217 


7003 


784CIP2C 237 


9106 


1646 


3432 


5218 


7004 


784CIP2C_238 


9139 


1647 


3433 


5219 


7005 


784CIP2C_239 


9555 


1648 


3434 


5220 


7006 


784CIP2C_240 


9650 


1649 


3435 


5221 


7007 


784CIP2C_241 


9889 


1650 


3436 


5222 


7008 


784CIP2C 242 


9933 


1651 


3437 


5223 


7009 


784CIP2C 243 


9953 


1652 


"l A 1 a 


5224 


7010 


784CIP2C_244 


9981 


1653 


3439 


5225 


7011 


784CIP2D 1 


746 


1654 


3440 


5226 


7012 


784CIP2D 2 


3558 


1655 


3441 


5227 


7013 


784CIP2D 3 


3559 


1656 


3442 


5228 


7014 


784CIP2D 4 


3633 


1657 


3443 


5229 


7015 


784CIP2D 5 


3658 


1658 


3444 


5230 


7016 


784CIP2D_6 


3732 


1659 




5231 


7017 


784CIP2D 7 


4004 


1660 


3446 


5232 


7018 


784CIP2D_8 


4700 


1661 


3447 


5233 


7019 


784CIP2D 9 


4703 


1662 


3448 


5234 


7020 


784CIP2D_10 


4774 


1663 


3449 


5235 


7021 


784CIP2D 11 


4894 


1664 


3450 


5236 


7022 


784CIP2D_12 


4918 


1665 


3451 


5237 


7023 


784CIP2D_13 


5159 


1666 


3452 


5238 


7024 


784CIP2D_14 


7443 


1667 


3453 


5239 


7025 


784CIP2D_15 


8673 


1668 


3454 


5240 


7026 


784CIP2D_16 


8679 


1669 


3455 


5241 


7027 


784CIP2D 17 


8727 


1670 


3456 


5242 


7028 


784CIP2D 18 


8734 


1671 


3457 


5243 


7029 


784CIP2D 19 


8756 
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OCtSi xu wo : 

of full- 
length 
nucleotide 
sequence 


SEQ ID 
NO: of 
full- 
length 
peptide 
sequence 


SEQ ID NO: 
of contig 
nucleotide 

sequence 


SEQ ID 
NO: 
of contig 
peptide 
sequence 


Priority 
docket number_ 
corresponding 
SEQ ID NO: in 
priority 
application 


SEQ ID 
NO: in 
U.S. S.N. 
09/488,725 


1672 


3458 


5244 


" "7630 


784CIP2D 20 


Bfilfl 
ooxo 


1673 


3459 


5245 


7031 


784CIP2D 21 




1674 


3460 


5246 


7032 


784CIP2D 22 


- aa4 6 


1675 


3461 


5247 


7033 


784CIP2D 23 


8912 


1676 


3462 


5248 


7034 


784CIP2D 24 


0 JlO 


1677 


3463 


5249 


7035 


784CIP2D 25 


o ai a 


1678 


3464 


5250 


7036 




0 3%X 


1679 


3465 


5251 


7037 




a bA i 
os%x 


1680 


3466 


5252 


7038 




8951 


1681 


3467 


5253 


7039 


784CIP2D_29 


8951 


1682 


3468 


5254 


7040 


784CIP2D_30 


"ofTan 

9007 


1683 


3469 


5255 


7041 




i)U14 


1684 


3470 


5256 


7042 




9013 


1685 


3471 


5257 


7043 


784CIP2D_33 


9025 


1686 


3472 


5258 


7044 


784CIP2D_34 


9053 


1687 


3473 


5259 


" 7Q4S " 1 


784CIP2D_35 


9054 


1688 


3474 


5260 


7046 




9054 


1689 


3475 


5261 






9113 


1690 


3476 


5262 


7048 


784CIP2D_38 


9134 


1691 


3477 


5263 


7049 


784CIP2D_39 


9152 


1692 


3478 




7050 


784CIP2D 40 


9152 


1693 


3479 


5265 


7051 


784CIP2D 41 


9211 


1694 


3480 




7052 


784CIP2D_42 


9223 


1695 


3481 


5267 


7053 


784CIP2D 43 


9223 


1696 


3482 


5268 


" "Tnt:/ 
/UdI 


784CIP2D 44 


9231 


1697 


3483 


5269 


7055 


784CIP2D 45 


9236 


1698 


3484 


5270 


7056 


784CIP2D_46 


9236 


1699 


3485 


5271 


7057 


784CIP2D 47 


9303 


1700 


3486 


5272 


7058 


784CIP2D_48 


9309 


1701 


3487 


5273 




784CIP2D_49 


9314 


1702 


3488 


5274 


7060 


784CIP2D_50 


9326 


1703 


3489 


5275 


7061 


784CIP2D_51 


9339 


1704 


3490 


5276 


7062 


784CIP2D_52 


9348 


1705 


3491 


5277 




784CIP2D_53 


9376 


1706 


3492 


5278 




784CIP2D_54 


9382 


1707 


3493 


5279 


"7P.Cc: 


784CIP2D_55 


9407 


1708 


3494 


5280 


7066 


784CIP2D_56 


9414 


1709 


3495 


5281 


7067 


784CIP2D_57 


9439 


1710 


3496 


5282 


7068 


784CIP2D_58 


9485 


1711 


3497 


5283 


7069 


784CIP2D_59 


9493 


1712 


3498 


5284 


7070 


784CIP2D_60 


9501 


1713 


3499 


5285 


7071 




9526 


1714 


3500 


5286 


7072 


784CIP2D_62 


9526 


1715 


3501 


5287 


7073 




9551 


1716 


3502 


5288 


7074 




9557 


1717 


3503 


5289 


7075 




9568 


1718 


3504 


5290 


7076 


784CIP2D_66 


9588 


1719 


3505 


5291 


7077 


784CIP2D_67 


9597 


1720 


3506 


5292 


7078 


784CIP2D_68 


9615 


1721 


3507 


5293 


7079 


784CIP2D_69 


QC*3Q 
JK>4a 


1722 


3508 


5294 


7080 




9649 


1723 


3509 


5295 


7081 


784CIP2D_71 


9652 


1724 


3510 


5296 


7082 




9660 


1725 


3511 


5297 


7083 


784CIP2D_73 


9662 


1726 


3512 


5298 


7084 


784CIP2D 74 


9725 


1727 


3513 


5299 


7085 


784CIP2D 75 


9746 


"~ 1728 


3514 


5300 


7086 


784CIP2D 76 


9777 


1729 


3515 


5301 


7087 


784CIP2D 77 


9787 


1730 


3516 


5302 


7088 


784CIP2D_78 


9790 


1731 


3517 


5303 


7089 


784CIP2D_79 


9842 


1732 


3518 


5304 


7090 


784CIP2D 80 


9842 


1733 


3519 


5305 


7091 


784CIP2D 81 


9848 
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SEQ ID NO: 
of full- 
length 
nucleotide 
sequence 


SEQ ID 
NO: of 
full- 
length 
peptide 
sequence 


SEQ ID NO: 
of contig 
nucleotide 
sequence 


SEQ ID 
NO: 

of contig 

peptide 

sequence 


Priority 
docket number_ 
corresponding 

SEQ ID NO: in 

priority 

application 


SEQ ID 
NO: in 
U.S. S.N. 
09/488,725 


1734 


3520 


5306 


7092 


784CIP2D 82 


9867 


1735 


3521 


5307 


7093 


784CIP2D 83 


10010 


1736 


3522 


5308 


7094 


784CIP2D_84 


10011 


1737 


3523 


5309 


7095 


784CIP2D 85 


10052 


1738 


3524 


5310 


7096 


784CIP2D 86 


10057 


1739 


3525 


5311 


7097 


784CIP2D 87 


10085 


1740 


3526 


5312 


7098 


784CIP2D 89 


10139 


1741 


3527 


5313 


7099 


784CIP2D 90 


10141 


1742 


3528 


5314 


7100 


784CIP2D 92 


10165 


1743 


3529 


5315 


7101 


784CIP2D 93 


10173 


1744 


3530 


5316 


7102 


784CIP2D 94 


10173 


1745 


3531 


5317 


7103 


784CIP2D 95 


10273 


1746 


3532 


5318 


7104 


784CIP2E 1 


3121 


1747 


3533 


5319 


7105 


784CIP2E 2 


3628 


1748 


3534 


5320 


7106 


784CIP2E 4 


3673 


1749 


3535 


5321 


7107 


784CIP2E 5 


4018 


1750 


3536 


5322 


7108 


784CIP2E 6 


4467 


1751 


3537 


5323 


7109 


784CIP2E 7 


4865 


1752 


3538 


5324 


7110 


784CIP2E 8 




1753 


3539 


5325 


7111 


784CIP2E 9 


4923 


1754 


3540 


5326 


7112 


784CIP2E 10 


A QO C 


1755 


3541 


5327 


7113 


784CIP2E 11 


4962 


1756 


3542 


5328 


7114 


784CIP2E 12 




1757 


3543 


5329 


7115 


784CIP2E 13 


A OCA ~ 


1758 


3544 


5330 


7116 


784CIP2E 14 


4988 


1759 


3545 


5331 


7117 


784CIP2E 15 


5835 


1760 


3546 


5332 


7118 




7682 


1761 


3547 


5333 


7119 


784CIP2E 17 




1762 


3548 


5334 


7120 


784CIP2E 18 




1763 


3549 


5335 


7121 


784CIP2E 19 


7707 


1764 


3550 




7122 


784CIP2E 20 


7707 


1765 


3551 


5337 


7123 


784CIP2E 21 


7752 


1766 


3552 


5338 


7124 


784CIP2E 22 


8357 


1767 


3553 


5339 


7125 


784CIP2E 23 


9065 


1768 


3554 


5340 


7126 


784CIP2E 24 


9324 


1769 


3555 


5341 


7127 


784CIP2F 1 


£y fK> 


1770 


3556 


5342 


7128 


784CIP2F 2 




1771 


3557 


5343 


7129 


784CIP2F 3 


4021 


1772 


3558 


5344 


7130 


784CIP2F 4 


4474 


1773 


3559 


5345 


7131 


784CIP2F 5 


4566 


1774 


3560 


5346 


7132 


784CIP2F 6 


4705 


1775 


3561 


5347 


7133 


784CIP2F 7 


4707 


1776 


3562 


5348 


7134 


784CIP2F 8 


4712 


1777 


3563 


5349 


7135 


784CIP2F 9 


5006 


1778 


3564 


5350 


7136 


784CIP2F 10 


5009 


1779 


3565 


5351 


7137 


784CIP2F 11 


5015 


1780 


3566 


5352 


7138 


784CIP2F 12 


5015 


1781 


3567 


5353 


7139 


784CIP2F 13 


7724 


1782 


3568 


5354 


7140 


784CIP2F 14 


7725 


1783 


3569 


5355 


7141 


784CIP2F_15 


8828 


1784 


3570 


5356 


7142 


784CIP2F_16 


8830 


1785 


3571 


5357 


7143 


784CIP2F 17 


9739 


1786 


3572 


5358 


7144 


784CIP2F 18 


9896 
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TABLE 7 



SEQ 
ID 

NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 


5359 


337 


1131 


AHLSARLSALILDEVAILPAPQNLSVLSTNMKHLLMWSPVIAPG 
ETVYYS VEYQGE YBSLYTSHIWI PS SWCSLTEGP3CDVTDDITA 
TVPYNLRVRATLGSQTS/CLEHP/VSIPLIETQPSLPDL/RMEI 
TKDGFHLVIELEDLGPQFEFLVAYWRR3PGAEEHVKMVRSGGIP 
VHLETMEPGAAYCVKAQTFVKAIGRYSAFSQTECVEVQGEAIPL 
VLALFAFVGFMLILVVVPLFVWKMGRLLQ/YLLLPRGGSSQTPW 
KITQF 


5360 


2 


1115 


PRVRSSGGQEDPASQQWARPRFTQPSKMRRRVIARPVGSSVRLK 
CVASGHPRPDITWMKDDQALTRPEAAEPRKKKWTIiSLKNLRPED 
SGKYTCRVSNRAGAINATYKVDVIQRTRSKPVLTGTHPVNTTVD 
FGGTT S FQCKVR SDVKP V IQWL KR VE YGAEGRHNS TI D VGGQKF 
mPTGDVWSRPDGSYLNKLLITRARQDDAGMYICLCSANTMGYS 
FRSAFLTVLPDPKPPGPPVASSSSATSLPWPWIG1PAGAVF1L 
GTLLLWLCQAQKKPCTPAPAPPLPGHRPPGTARDRSGDKDLPSL 
AALS AG PG VGLCEEHG S PAAPQHLLG PG P VAGP KLYP KLYTGHS 
TPHTYTHPPPSCQLNSSHS 


5361 


3 


925 


HEGS I S SAN I LLD DQFQ P KLTDFAMAH ?RSHLEHQ SCTINMTS S 
SSKELWYMPEEYIRQGKLSIKTDVYSFGIVIMEVLTGCRWLDn 
P KH I QLRDLLREIMEKRGLDSCLSFIiDKKVP PCPRNFSAKLFCL 
AGRCAATRAKLRPSMDBVLNTLESTQAS L YFAE DP PTSLKS FRC 
PSPLFLENVPS I PVEDDESQNNNLLPSDEGLRIDRMTQKTPFEC 
S QSE VM FLS LD KKPES KRNEEACNM P S S S CE ES W F P ICY I VPS QD 
LRPYKVNIDPSSEAPGHSCRSRPVESSCSSKFSWDEYEQYKKE 


5362 


2 


4879 


3 CQ VEGCTRT YNSS QS IG KHM KTAHP DQ YAA FKMQRKS KKGQXA 
NNLKTPNNGKFVYFLPSPVNSSNPFFTSQTKANGNPACSAQLQH 
VSPPIFPAHLASVSTPLLSSMESVJNPNITSQDKN3QGGMLCSQ 
MENLPSTAliPAQMEDLTKTVLPLNIDRGSDPFLSLPAESSSIDL 
FPS PADS GTNS VFS QLENNTNH YS S Q I EGNTNS S FLKGGNG ENA 
VFPSQVNVANKFSSTNAQQSAPEKVKKDRGRGQTGKERKPKHMK 
RAKWPAI IRDGKFX CSRCYRAFTNPRSLGGHLS KRS YCKPLDGA 
EIAQELLQSNGQPSLIiASMILSTNAVNLQOPQQSTFNPEACFKD 
PSFLQLLAENRSPAFLPNTFPRSGVTNFNTS VSQEGSE 1 1 IQAL 
ETAGI PSTFEGAEMLSHVSTGCVSDASQVNATVM PNPTVPPLLH 
TVCHPNTLLTNQNRTSNS KTSSI EECSSLPVFPTNDIXLKTVEN 
GLCSSSFPNSGGPSQNFTSNSSRVSVISGPQNTRSSHLNKKGNS 
AS KRRKKVAPPL I APNASQNLVTSDLTTMGLI AKS VE I PTTNLH 
SNVIPTCEPQSLVENLTQKLNNVNNQLFMTDVKENFKTSLESHT 
VLAPLTLKTENGDS QMMALNS CTTS VNSDLQ I S E DNV I QNFE KT 
LEIIKTAMNSQILEVKSGSQGAGETSQNAQINYNIQLPSVNTVQ 
NNKLPDSSP\FSSFISVMPTESNIPQSE\VSHKEDQIQEILEGL 
QKLKLENDLSTPASQCVLINTSVTLTPTPVKSTADI TVI QP VSE 
MINI Q FNDKVNK P FVCQNQGCN YSAMTKDAL FKHYG K I HQYTP E 
MILEIKKNQLKFAPFKCWPTCTKTFTRNSNLRAHCQLVHHFTT 
EEMVKLKI KRPYGRKSQSENVPASRSTQVKKQLAMTEENKKESQ 
PALELRAETQNTHSNVAVIPEKQLIEKKSPDKTESSLQVITVTS 
EQC^rrNAlfTNTQTKGRKIRRHKKBKEEKKRKKPVSQSLEFPTRY 
SPYRPYRCVHQGCFAAFTIQQNLILHYQAVHKSDLPAFSAEVEE 
ESEAGKES EETETKQTLKEFRCQVSDCSRI FQAI TGLIQHYMKL 
HEMTPEEI ES MTASVDVGK PPCDQLECKSS FTTYLNYWHLEAD 
HGIGLRAS KTEEDGVYKCDCEGCDR I YATRSNLIiRHI FNKHNDK 
HKAHL I R PRR LT PG Q ENMS S KAN QE KS KS KHRGTKHS RCGKEG I 
KMPKTKRKKKNNLENKNAKIVQIEENKPYSLKRGKHVYSIKARN 
DALS ECTS RF VTQ YP CM I KGCTS WTSESNI I RH YKCH KLS KAF 
TSQHRNLLIVFKRCCNSQVKETSEQEGAKNDVKDSDTCVSESND 
WSRTTAT VSQKE VEKNE* DEMDELTELFITKL INEDSTS VETQA 
NTSS NVSNDFQE DNL CQ S ERQ KAS NL KR VNKE KNVS QNKKRKVE 
KAE PASAAELSSVRKEEETAVAIQTI EEHPAS FDWSSFKPMGFE 
VSFLKFLEESAVKQKKNTDKDHPNTGNKKGSHSNSRKN1DKTAV 
TSGNHVCPCKESETFVQFANPSQLQCSDNVKIVLDKNLKDCTEL 
SEQ ID NO:


Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence


Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine,
R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
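The symbol legend in the column heading above can be applied mechanically to any amino acid segment in this table. The following is a minimal sketch, assuming the segment is supplied as a plain string; the function name `summarize_segment`, the dictionary names, and the short example string are hypothetical and are not part of the specification. Whitespace is ignored, and any symbol outside the legend (for example a lowercase letter or a digit left by scanning) is reported rather than interpreted.

```python
from collections import Counter

# One-letter residue codes exactly as given in the table legend.
AMINO_ACIDS = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
}

# Non-residue symbols from the legend; the key names are hypothetical labels.
MARKERS = {
    "X": "unknown",
    "*": "stop_codon",
    "/": "possible_deletion",
    "\\": "possible_insertion",
}


def summarize_segment(segment: str) -> dict:
    """Count residues and legend markers in one amino acid segment."""
    compact = "".join(segment.split())  # drop spaces and line breaks
    counts = Counter()
    unrecognized = []  # (1-based position in compact string, symbol)
    for position, symbol in enumerate(compact, start=1):
        if symbol in AMINO_ACIDS:
            counts["residues"] += 1
        elif symbol in MARKERS:
            counts[MARKERS[symbol]] += 1
        else:
            unrecognized.append((position, symbol))
    summary = {"residues": counts["residues"]}
    for name in MARKERS.values():
        summary[name] = counts[name]
    summary["unrecognized"] = unrecognized
    return summary


# Hypothetical example string, not taken from the table.
example = summarize_segment("MKV\\LA/G X*q")
print(example)
```

Reporting unrecognized symbols separately, instead of guessing at them, keeps the count of residues limited to characters that the legend actually defines.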








VLKQLQEMKPTVSLKKLEVHSNDPDMSVMXDI S IGKATGRGQ Y 


5363 


8066 


703 


RLCCTGGGEGTPGASGKRGPAATTSLVLCIPSVPPPVPFPTLWP 
PPSWRRQPPGGIRRDFSRRXRREANLVATCLPVRASLPHRLNML 
RGPG PGLLLLAVLCLGTAVPSTGAS KSKRQAQQM VQPQS P VAVS 
QS K PG CYDNGKHYQ INQQWERTYLGNALVC TC YGGSRG FNCE S K 
PEAEETC FDKYTGNTYR VGDT YE R PKDSMI WDCTC IGAGRGR X S 
CTIANRCHEGGQSYKIGDTWRRPHETGGYMLECVCLGNGXGEWT 
CKP I AEKCFDHAAGTS YWGETWEKPYQGWMMVDCTCLGEGSGR 
ITCTS RNRCNDQDTRTS YRIGDTWSKKDNRGNLLQC I CTGNGRG 
EWKCERHTS VQ TTSSG SG PFTD VRAAVYQ PQ PH PQ PPP YGHCVT 
DSGWYSVGMQLA* KTQGNKQML\CTCLGNGVSCQETAVTQTYG 
GNSNGEPCVLPFTYNGRTFYSCTTEGRQDGHLWCSTTSNYECDQ 
KYSFCTDHTVLVQTRGGNSNGALCHFPFLYNNHNYTDCTSEGRR 
DNMKWCGTTQNYDADQKFGFCPMAAHEE I CTTNEGVMYRIGDQW 
DKQHDMGHMMRCTCVGNGRGEWTCIAYSQLRDQCIVDDITYNVN 
DT FHKRHE EGHMLNCTC FGQGRG RWKCDP VD Q CQDS ETGT FYQ I 
GDSWEKYVHGVRYQCYCYGRGIGEWHCQPLQTYPSSSGPVEVFI 
TETPSQPNSHPIQWNAPQPSHISKYILRWRPKNSVGRWKEATIP 
GHLNSYTIKGLKPGWYEGQLISIQQYGHQEVTRFDFTTTSTST 
PVTSNT\VTGETTPFSPLVATSESVTEITASSFWSWVSASDTV 
SG FR VE YE LS E EGDE PQ YLVLPSTATSV\ N I P\ DLLPGRK YI VN 
VYQISEDGEQSLILSTSQTTAPDAPPDPTVDQVDDTSIWRWSR 
PQAP I TG YRI V YSP SVEG SSTELN L P ETANS VTLSDLQPG VQ YN 
ITIYAVEENQESTPWIQQETTGTPRSDTVPSPRDLQFVEVTDV 
KVTIMWTPPESAVTGYRVDVIPVNLPGEHGQRLPLSRNTF\AEN 
TGLS PGVTYY FKVFAVSHGRES KPLTAQQTTKlADAPTNLQF VN 
ETDST VL VRW TPPRAQ I TG YRLTVGLTRRGQ PRQYNVG PS VS KY 
PLRNLQPAS EYTVS LVAI KGNQES PKATGVFTTLQPGS SIPPYN 
TEVTETTIVITWTPAPRIGFKLGVRPSQGGEAPREVTSDSGSIV 
VSGLTPGVEYVYTIQVLRDGQERLAP\IVNK\WTPLSPPTNLH 
LEANPDTGVIjTVSWERSTTPDITGYRITTTPTNGQQGNSIjEEVV 
HADQSSCTF \DNLEVPGLS YNVS VYTVKDDKES VPI SDT 1 1 PAV 
P PPTDLRFTN/ ILGPDTMRVTW \APP PSI DLTNFLVRYSP VKNE 
GRMLQSLS 1 FFLSDN\AWLTNLLPGTEYWS VSSVYEQHESTP 
\LRGRQKTGLDSP\TGIDFS\D1TA\NSFT\VHW\IAPRA/TPI 
TGYRIR\HHPEHF\SGRPREDR\VPHSRNSITLTNLTPGTEYW 
SIVALNGREES PLLIGQQSTVSDVPRDLEWAATPTSLLI \ SWD 
APAVTVRY YR I TYGETGGNSPVQE FTVPGSKS TATI SGLKPGVD 
YT I TVYAVTGRGDS PAS S KP I S INYRTE I D KPSQMQVTD VQDNS 
ISVKWLPSSSPVTGYRVTTT\PKNGPG\PTKTiCTAGPDQTEMTI 
EGLQPTVE YWS VYAQNPSGES QPLVQTAVTNI DRP KGLAFTD V 
DVDS I KI AWES PQGQVSRYRVTYS SPEDG IHELFPAPDGEEDTA 
ELOGLRPGSEYTVSWALHDDMESQPLIGTQSTAIPAPTDLKFT 
QVTP TSLS AQWTPPNVQLTGYRVRVT PKEKTGPMKE INLAPDS S 
SVWSGLMVATKYEVSVYALKDTLTSRPAQGWTTLENVSPPRR 
ARVTDATETT T TI S WRTKTETI TG FQ VDAVP ANGQT P I QRTI KP 
DVRSYTITGLQPGTDYKrYLYTLNDNARSSPWIDASTAIDAPS 
NLRFLATTPNS LLVSWQP PRARI TGYI IKYEKPGS PPRE WPRP 
RPGVTEATITGLEPGTEYTIYVIALKNNQKSEPLIGRKKTDELP 
QLVTLPHPNLHGPEILDVPSTVQKTPFVTHPGYDTGNGIQLPGT 
SGQQ P S VGQQM I FEEHG FRRTTP PTTATP I RHRPRP YP PNVGQE 
ALSQTT IS WAP FQDTS E Y I ISCHP VGTDE E PLQFR VPGTS TSAT 
LTGLTRGATYN 1 1 VEALKDQQRHKVREEVVTVGNS VNEGLNQPT 
DDSCFDPYTVSHYAVGDEWERMSESGFKLLCQCLGFGSGHFRCD 
SSRWCHDNGVWYKIGEKWDRQGENGQMMSCTCLGNGKGEFKCDP 
HEATCYDDGKTYHVGEQWQKEYLGAICSCl'CFGGQRGWRCDNCR 
RPGGE PS P EGTTGQS YNQYSQRYHQRTNTNVNC P I E C FM PLDVQ 
ADREDSRE 


5364 


8066 


703 


RLCCTGGGEGT PGASG KRGPAATTS LVLC I PS VPPP VP FPTLWP 
PPSWRRQPPGGIRRDFSRRLRREANLVATCLPVRASLPHRX,NML 
RGPGPGLLLIAVLCLQTAVPSTGASKS KRQAQQM VQ PQS P VAVS 








QSKPGCYDNGKHYQINOQWERTYLGNALVCTCyGGSRtiFNCESK 
PEAEETCFDKYTGNTYRVGDTYERPKDSMIWDCTCIGAGRGRIS 
CTIANRCHEGGQSYKIGDTWRRPHETGGYMIiECVCLGNGKGEKT 
CRPIAEKCFDHAAGTSYWGETWEKPYQGWMMVDCTCLGEGSGR 
ITCTSRNRCNDQDTRTSYRIGDTWS KKDNRGNLLQC ICTGNGRG 
EWKCERHTSVQTTSSGSGPFTDVRAAVYQPQPHPQPPPYGHCVT 
DSGWYS VGMQLA * KTQGNKQMlA CTCLGNGVS CQETAVTQTYG 
GNSNGE PCVLP FTYNGRTP YSCTTEGRQDGHLWCSTTSNYEQDQ 
KYS FCTDHTVLVQTRGGNSMGALCHF PFLYNNHNYTDCTSEGRR 
DNMKWOGTTQN YDADQKFG FCPMAAH3E I CTTNEGVM YR IG DQW 
DKQHDMGHMMRCTCVGNGRGEWTCI AYSQLRDQCI VDD I TYNVN 
DTFHKRHEEGHMLNCTCFGQGRGRWKCDPVDQCQDSETGTFYQI 
GDSWSKYVHGVRYQCYCYGRGIGEWHCaPLQTYPSSSGPVEVFI 
TET PSQPNSHP I QWN APQ P3H I S KY I LRW R P KNS VGRW KEATI P 
GHLNS YT I KGLX PGWYEGQLIS I QQ YGHQE VTR FDFTTTS TST 
PVTSNT\VTGETTPFSPLVATSESVTEITASSFWSWVSASDTV 
SGFRVEYELSEEGDBPQYLVLPSTATSV\NIP\DI#LPGRKYIVN 
VYQ I S EDGEQS L I LSTSQTTAPDAP PD PT VDQVDDTS I WRWSR 
PQAPITGYRIVYSPSVEGSSTELNLPETANSVTLSDLQPGVQYN 
ITIYAVEENQESTPWIQQETTGTPRSDTVPSPRDljQFVEVTDV 
KVTIMWTP PESAVTGYR VDVI PVNIjPGEHGQRLPLSRNT F\ AEN 
TGLSPGVTYYFKVFAVSHGRESKPLTAQQTTKL\DAPTNI>QFVN 
ETDSTVLVRWTPPRAQ I TG YRLTVGLTRRGQPRQYNVGPSVS KY 
PLRNI*QPASEYTVSLVAIKGNQESPKATGVFTTLQPGSSIPPYN 
TEVTETTIVITWTPAPRIGFKLGVRPSQGGEAPREVTSDSGSIV 
VSGLTPGVEYVYTIQVLRDGQERDAP\lVNK\WTPIiSPPTNLH 
LEAN P DTGVLTVS WE RS TT PDI TGYR I TTTPTNGQQGNS L EE W 
HADQSSCTF\DNLEVPGLEYNVS VYTVKDDKESVP ISDT 1 1 PAV 
PPPTDLRFTN / 1 LGPDTMRVTW\APPPS IDLTNFLVRYSP VKNE 
GRMLQSLS I FFLS DN\A WLTNLLPGTSYWS VSS VYEQHES TP 

\LRGRQKTGLDSP\TGIDFS\DITA\NSFT\VHW\IAPRA/TPI 
tgyrirxhhpehfSsgrpredrwphsrnsitltwltpgteyw 

SI VALNGREES PLLIGQQSTVSDVPRDLEWAATPTSIjLI \ SWD 
APAVT VR YYR I TYGETGGNS PVQEFTVPGS KSTATISGLKPGVD 
YTITVYAVTGRGDSPASSKPISINYRTEIDKPSQMQVTDVQDNS 

ISVKWLPSSS PVTGYRVTTT\PKNGPG\ PTKTKTAGPDQTEMTI 

EGLQ PTVE YWS VYAQN P SGESQPLVQTAVTN I DRPKGJbAFTD V 
DVDS I KIAWES PQGQVSRYRVTYSSPEDGIHELFPAPDGEEDTA 
ELQGLRPGSEYTVSWALHDDMESQPLIGrQSTAIPAPTDLKFT 
QVTPTSLSAQWTPPNVQLTGYRVRVTPKEKTGPMKEINLAPDSS 
SWVSGLMVATKYEVSVYALKDTLTSRPAQGWTTLENVSP PRR 
ARVTDATETT I T I S WRTKTET I TGFQ VDAVP ANGQT P I QRTI KP 
DVR S YT ITGLQ PGTDYKI YL YTLNDNARS S P W I DAS TAI DAPS 
NLRFLATTPNS LLVS WQ P P RAR I TGY 1 1 KYE KPGS P PRE W P R P 
RPGVTEATITGLE PGTEYT I YVIALKNNQKSEPLIGRKKTDELP 
QLVTLPHPNLHGPE I LDVPSTVQKTP FVTHPGYDTGNGIQLPGT 
SGQQPS VGQQM I FEEHGFRRTTP PTTATP I RHRPRP YPPNVGQE 
ALSQTT I SWAP FQDTSEY IISCH P VGTDE E PLQFR VPGTSTS AT 
LTGLTRGATYN 1 1 VEALKDQQRHKVRBE WTVGNSVNEGLNQPT 
DDSCFDPYTVSHYAVGDEWERMSESGFKLLCQCLGFGSGHFRCD 
SSRWCHDNGVNYKIGEKWDRQGENGC2MMSCTCLGNGKGEFKCDP 
HEATC YDDGKT Y H VGEQWQ KE YLG AI CSCTC FGGQRG WRCDN CR 
RPGGEPS PEGTTGQS YNQYSQRYHQRTNTNVNCP I ECFMPLDVQ 
ADREDSRE 


5365 


8066 


703 


RLCCTGG GEGT PG ASGKRG P AATTSLVLC I PS VP P P VP FPTLW P 
PPSMRRQPPGGiRRDFSRRLRREANLVATCLPVRASLPHRLNMIi 
RGPG PGLLLLAVLCLGTAVPS TGASKS KRQAQQMVQPQS PVAVS 
QSKPGCYDNGKHYQINQQWBRTYLGNALVCTCYGGSRGFNCESK 
PEAEETCFDKYTGNTYRVGDTYERPKDSMIWDCTCIGAGRGRIS 
CTT ANRCHEGGQS YK IGDTWRRPHETGG YMLECVCLGWGKGE WT 
CKPIAE KC FDHAAGTS YWG ETWEKP YQG WMMVDCTCLGEGSGR 








ITCTSRNRCNDQDTRTS YRIGDTWSiCKJDNRGNLLQCi CTX3NGRG 
EWKCERHTSVQTTSSGSGPFTDVRAAVYQPQPHPQPPPYGHCVT 
DSGWYS VGMQLA* KTQGNKQML\CTCLGNGVSCC.ETAVTQTYG 
GNSNGEPCVLPFTYNGRTFYSCTTEGRQDGHLWCSTTSNYBQDQ 
KYS FCTDHT VL VQTRGGNSNGALCHFPFL YNNHCfYTDCTS EGRR 
DNMKWCJGTTQN YDADQKFGFCPMAAHEE ICTTNEGVM YR IGDQW 
DKQHDMGHMMRCTCVGNGRGEWTCI AYSQLRDQCI VDDI TYNVN 
DTFHKRHEEGHMLNCTC FGQGRG RW KCD P VDQCQDS ETGTF YQ I 
GDSWEKYVHGVRYQCYCYGRGIGEWHCQPLQTYPSSSGPVEVFI 
TETPSQPNSHPIQMNAPQPSHISKYILRWRPKNSVGRWKEATIP 
GKLNSYTI KGL KPGWY EG QL I S I QQYGHQE VTR FDFTTTSTS T 
PVTSNT\VTGETTPFSPLVATSES VTEITASS F WSWVSASDTV 
SGFRVEYELSEEGDEPQYLVLPSTATSV\NIP\DLLPGRKYIVN 
VYQIS EDGEQSL I liSTSQTTAPDAPPDPTVDQVDDTS I WRWSR 
PQ AP I TGYR I VYS PS VEGS STE LNL P ETANS VTL 5DLQ PGVQYN 
ITI YAVEENQESTPWIQQETTGTPRSDTVPS PRDLQFVEVTDV 
KVTIMWTPPESAVTGYRVDVIPVNLPGEHGQRLPLSRNTFVAEN 
TGLSPGVTYYFKVFAVSHGRESKPLTAQQTTKL\DAPTNLQFVN 
ETDSTVLVRWTPPRAQITGYRLTVGLTRIU3QPRQYNVGPSVSKY 
PLRNLQPASEYTVSLVAI KOXQES PKATG VFTTLQPGS S IPPYN 
TEVTETTIVITWTPAPRIGFFCLGVRPSQGGEAPREVTSDSGSIV 
VSGLTPGVEYVYTIQVLRDGQERDAP\IVNK\WTPLSPPTNLH 
LEANPDTGVLTVSWERSTTPDITGYRITTTPTNGQQGNSLEEW 
HADQS SCT F\ DNLF.VPGLEYNVS VYTVKDDKES VPI SDT 1 1 PAV 
PPPTDLRFTN/ 1 LGPDTMR VTW\AP PP S IDLTNFLVRYS P VKNE 
GRMLQSLS I FFLS DN\AWLTNLLPGT3 YWS VSS VYEQHESTP 
\LRGRQKTGLDSP \TGIDFS \ DITA\NS FT \VHW\ IAPRA/TPI 
TGYRIR\HHPEHF\SGRPREDR\VPHSRNSITLTNIiTPGTEYVA^ 
SrVALNGREESPLLIGQQSTVSDVPRDLEWAATPTSLIil\SWD 
APAVTVRYYRI TYGETGGNSP VQEFTVPGSKSTATI SGLKPGVD 
YTI TVYAVTGRGDSPASS KPI SINYRTE IDK PSQMQVTDVQDNS 
ISVKWLPSSSPVTGYRVTTT\PKNGPG\PTKTKTAGPDQTEMTI 
EGLQ PTVE YWS VYAQNP SGE SQ PL VQTAVTN IDR P KGLAFTD V 
DVDSIKIAWESPQGQVSRYRVTYSSPEDGIHELFPAPDGEEDTA 
ELQGLRPGSEYTVSWALHDDMESQPLIGTQSTArPAPTDLKFT 
QVTPTSLSAQWTPPNVQLTGYRVRVTPKEKTGPMKEINLAPDSS 
S VWSGLMVAT KYEVS VYALKDTLTSRPAQGWTTLENVS PPRR 
ARVTDATETT I TI S WRTKTET I TGFQVDAVP ANGQTP IQRT I KP 
DVRSYTITGLQPGTDYKIYLYTLNDNARSSPWIDASTAIDAPS 
NLRFLATTPNSLLVSWQPPRARITGY1IKYEKPGSPPREWPRP 
RPGVTEATITGLEPGTEYTIYVIALKNNQKSEPLIGRKKTDELP 
QLVTLPHPNLHGPErLDVPSrVQKTPFVTHPGYDTGNGIQLPGT 
SGQQPS VGQQM I FEEHGFRRTTPPTTATP I RHRPRPYPPNVGQE 
ALSQTT I S WAPFQDTSE Y 1 1 S CHP VGTDEE PLQFR VPGTSTS AT 
LTGLTRGAT YN 1 1 VEALKDQQRHKVREE WTVGNS VNEGLNQPT 
DDSCFDPYTVSKYAVGDEWERMSESGFKLIiCQCLGFGSGHFRCD 
S SRWCHDNG VN Y KIGE KWDRQG ENGQM MS CTCLGNGKGE FKCDP 
HEATCYDDGKTYHVGEQWQKEYLGAICSCTCFGGQRGWRCDNCR 
RPGGEPS P EGTTGQS YNQYS Q R YHQRTNTNVNC P I ECFMPLD VQ 
ADREDSRE 


5366 


8066 


703 


RirCCTGGGEGTPGASGKRGPAATTSIiVLCIPSVPPPVPFPTLWP 
P P S WRRQP PGG I RRDFS RRLR R E ANLVATCLP VRAS LPHR LNM L 

RGPGPGLLLLAVLCLGTAVPSTGASKSKRQAQQMVQPQSPVAVS 
QS KPGC YDNGKHYO T NOftWFPTVT fiWAT . vptp vrzn qop mine o v 

PEAEETCFDKYTGNTYRVGDTYERPKDSMIWDCTCIGAGRGRIS 
CTIANRCHEGGQSYKIGDTWRRPHETGGYMLECVCLGNGKGEWT 
CKPIAEKCFOHAAGTSYWGETWEKPYQGWMMVDCTCLGEGSGR 
ITCTSRNRCNDQDTRTSYRIGDTWSKKDNRGNLLQCICTGNGRG 
E WKCERHTS VQTTS S G S G PFTD VRAAVYQPQ PH PQP P P YGHCVT 
DSGWYS VGMQLA* KTQGNKQML \ CTCLGNG VS CQE TAVTQTYG 
GNSNGEPCVLPFTYNGRTFYSCTTEGRQDGHLWCSTTSNYEQDQ 








KYSFCTDHTVLVQTRGGNSNGALCHFP^YNNH^VTt)CTSEGRR ' 
DNMKWCX3TTQNYDADQKFGFCPMAAHEE I CTTNEGVM YR IGDQW 
DKQHDMGHMMRCTCVGNGRGEWTCIAYSQLRDQCIVDDITYNVN 
DTPHKRHEEGHMLNCTCFGQGRGRWKCDPVDQCQDSETGTFYQI 
GDS WE KYVHGVR YQC Y C YGRG I GE WHCQPLQT YPS SSG PVEVF I 

TETPSQPNSHPIQWNAPQPSHISKYILRWRPKNSVGRWKEATIP 
GHLNS YT I KGLKPG WYEGQL IS I QQYGHQ EVTRFDFTTTS T£ T 
PVTSNT WTGETTP FS PLVATSES VTEITASS FWSWVS ASDT V 
SGFRVEYELSEEGDEPQYLVLPSTATSV\NIP\DLLPGRKYIVN 
VYQISEDGEQSLILSTSQTTAPDAPPDPTVDQVDDTSIWRWSR 
PQAPRTGYRLVYSPSVEGSSTELNLPETANSVTLSDLQPGVQYN 
IT I YAVEENQEST PW I QQETTGT PRSDTVPS PRDLQF VEVTD V 
KVTIMWTPPESAVTGYRVDVIPVNLPGEHGORLPLSRNTF\AEN 
TGLS PGVTYYFKVFAVSHGRES KPIITAQQTTKL\DAPTNLQFVN 
ETDSTVLVRWTPPRAQITGYRLTVGLTRRGQPRQYNVGPSVSKY 
PLRNLQPASE YTVSLVAI XGNQES PKATGVFTTLQPGSS I PPYN 
TEVTETTIVITWTPAPRIGFKLGVRPSQGGEAPREVTSDSGSIV 
VSGLTPGVEYVYTIQVLRDGQERDAP\RVNK\WTPLSPPTNLH 
LEANPDTG VL TVS WERS TT PD ITG YR I TTT PTNGQQGNSLE EW 
IIADQS S CTF\ DNLEVPGLE YNVSVYTVKDDKESVPISDTI I PAV 
PP PTDLRFTN / 1 LGPDTMR VTW \ AP PP S I DLTNFLVR YS P VKNE 
GRMLQS LS I F FLS DN\ AWLTNLL PGTE Y WS VS S V YEQHE STP 
\LRGRQKTGLDSP\TGIDFS\DITA\NSFT\VHW\IAPRA/TPI 
TGYRIR\HHPBHF\SGRPREDR\VPHSRNSITLTNLTPGTEYW 
SIVALNGREESPLLIGQQSTVSDVPRDLEWAATPTSLLI\SWD 
APAVTVR YYRI TYGETGGNSP VQEFTVPGS KSTATISGLKPGVD 
YTI TVYAVTGRGDS PAS S KPI S INYRTE IDKPSQMGVTDVQDNS 
1 9 VKWLPSSSPVTGYRVTTT\PKNGPG\ PTKTKTAGPDQTEMTI 
EGLQPT VE YWSVYAQNPSGESQPLVQTAVTN I DRPKGLAFTDV 
DVDS I KIAWES PQGQVSRYRVTYSS PEDGIHELFPAPDGEEDTA 
ELQGLRPGSEYTVSWALHDDMESQPLIGTQSTAIPAPTDLKFT 
QVTPTSLSAQWTPPNVQLTGYRVRVTPKEKTGPMKEINLAPDSS 
S VWS GLMVATKYE VS VYALKDTLTS R PAQGWTTLENVS PPRR 

ARVTDATETTITISWRTKTETITGFQVDAVPANGQTPIQRTIKP 
DVRSYTITGLQPGTDYKIYLYTLNDNARSSPWIDASTAIDAPS 
NLR FLATTPNS LLV5 WQP P RAR I TG Y 1 1 KYEKPGS ? PRE WPRP 
RPGVTEATITGLEPGTEYTIYVIALKNNQKSEPLIGRKKTDELP 
QLVTLPHPNIjHGPEILDVPSTVQKTPFVTHPGYDTGNGIQLPGT 

SGQQPS VGQQM I FEEHGFRRTTPPTTATPIRHRPRPYPPNVGQE 
ALSQTTISWAPFQDTSEYIISCHPVGTDEEPLQFRVPGTSTSAT 
LTGLTRGATYNI IVEALKDQQRHKVREEWTVGNSVNEGLNQPT 
DDSCFDPYTVSKYAVGDEKERMSESGFKLLCQCLGFGSGHFRCD 
SSRWCHDNGVNYKIGEKWDRQGENGQMMSCTCLGNGKGEFKCDP 

HEATCYDDGKTYHVGEQWQKEYIiGAICSCrCFGGQRGWRCDNCR 

RPGGEPSPEGTTGQSYNQYSQRYHQRTNTNVNCPIECFMPLDVQ 
ADREDSRE 


5367 


235 


3591 


KKILNMLCKkNIVIEYLAtl^YEYLYGFCFSGIKKYLIIHVLRL 
ILEIiWMTRLLLEKSVSLQTQYLLLIVKILSWFPGKEMRHHLQIM 
EVMMRKQDS/RIVGNGSEQQLQKELADVLMDPPMDDQPGEKELV 
KRSQLDGEGDGPLSNQLSASSTINPVPLVGLQKPEMSLPVKPGQ 
GDSEASSPFTPVADEDSWFSKLTYLGCASVNAPRSEVEALRMM 
SILRSQCQISLDVTLSVPNVSEGIVRLLDPQTNTEIANYPIYKI 
LFCVRGHDGTPESDCFAFTESHYNAELFRIHVFRCEIQEAVSRI 
uiacHiMr KKijAAy i v Lib A TAAPQ TPDSD I FT FS VSLE I KEDDG 
KGYFSAVPKDKDRQCFKLRQGIDKKI VI YVQQTTNKELAI ER CF 
GLLLSPGKDVRNSDMHLLDLESMGKSSDGKSYVITGSWNPKSPH 
FQVVNEETPKDECVLFMTTAVDLVITEVQEPVRFLLETKVRVCS P 
NERLFWPFSKRSTTEKFFLKLKQIKQRERKNNTDTLYEWCLBS 
ESERERRKTTASPSVRLPQSGSQSSVIPSPPEDDEEEDNDEPLL 
SGSGDVSKECAEKILETWGELLSKWHIiNLNVRPKQLSSLVRNGV 
PEALRGEWJQLLAGCHNNDHLVEKYRILITKESPQDSAITRDIN 








RTFPAHDYFKDTGGDGQDSLYKICKAYSVYDEEIGYCQGQSFIiA 
AVLLLHMPEEQAFSVLVKIMFDYGLRELFKQNFEDLHCKFYQLE 
RLMQE YI PDL YNHFLD IS LEAHM YASQWFLTLFTAKFPLYMVFH 
IIDLLLCEGISVIFNVALGLLKTSKDDLLJ.TDFEGALKFFRVQL 
PKRYRSEENAKKLMELACNMKISQKKLKKYEKEYHTMREQQAQQ 
EDP I ERFERENRRLQEANMRLEQENDDLAHELVTS KIALRKDLD 
NAEEKADALNK3LLMTKQKLIDAEEEKRRLEEESAHLKKMCRRE 
LDKAESEIKKNSSIIGDYKQICSQLSERLEKQQTANKVEIEKIR 
QKVDDCERCREFFNKEGRVKG I SS TKEVLDEDTDEE KETLKNQL 
REMELELAQTKL\QLVEASCKIQD\LEHPF*GLPFNE\VQAA\K 
KTWFNRTLS S I KTATG VQGKETC 


5368 


573 


2014 


GAAAGAADPRRGSLGGRTMLDFA I FAVTFLLALVGAVLYL YPAS 
RQAAGIPGITPTEEKDGNIiPDlVNSGSLHEFLVNLHERYGPWS 
FWFGRRLWSLGTVDVLKQHINPNKTUJ/L F *NHAEVI IKVS I W 
WWQCE * KP\ORKKLYBNGVTDSLKSNFALLLKLPEELLDKWIiS Y 
PETQH \ VPL S QHMLG FAMKS VTQM VMGSTFEDDQEV I RFQKNHG 
TVWS E IGKG FLDGSLDKKMTRKKQ YEDALMQLES VLRNI I KERK 
GRNFSQHI F IDS L VQGNLNDQQ I LED3 M I FS IiASC I 1 TAKLCTW 
AIWFLTTSEEVQKKLYEEINQVFGNGPVTPEKIEQLRYCQHVLC 
ETVRTAKLTPVSAQLQDI EGKIDRF 1 1 PRETLVLYALGWLQDP 
NTWPS PHKFDPDRFDDELVMKTFS S LGFSGTQECPELRFAYMVT 
TVLLS VLVKRLHLLSVEGQVIETKYELVTSS REEAWITVS KRY 


5369 


1 


6622 


PRS LCFSLWAEAAVLiADGGLRRRRRLLRGTMSAS FVPNGASLED 
CHCNLFCLADLTGIKWKKYVWQG PTS AP ILFPVTEEDP ILSSFS 
RCLKADVI^/ VWRRDQRPERRE \ L * I FWGGEDP\ VLLTLFTMTY 
QKKKMECGRMDFPMNAVLCFSKAVHNLLERCLMNRNFVRIGKWF 
VKPYEKDEKPINKSEHLSCSFTFFLHGDSNVCTSVEINQHQPVY 
LLSEEHITLAQQSNSPFQVILCPFGLNGTLTGQAFKMSDSATKK 
LIG E W KQFY P I S CCLKEMSEEKQEDMDWEDD S LAAVE VL VAG VR 
MIYPACFVLVPQSDIPTPSPVGSTHCSSSCLGVHQVPASTRDPA 
MS S VTLTPPTS P BE VQT VD PQS VQ KWVKFS S VSDGFNSDS TSHH 
GGKI PRKLANHWDRVWQE CNMNRAQNKRKYSAS SGGLCEE ATA 
AKVASWDFVEATQRTNCS CLRHKNLKSRNAGQQGQAPSLGQQQQ 
ILPKHKTNEKQEKSEKPQKRPLTPFHHRVSVSDDVGMD\ADS\A 
SQRLV\ISAP\DSQ\VRFSNIR\TNDVAK\TPQMHGTEMANSPQ 
PPPLSP\HPCDWDEGVTKTPSTPQSQHFYQMPTPDPLVPSKPM 
EDR I D SLS QS FP PQ YQEAVE PT VYVGTAVNLE ED E AN IAWKYYK 
FPKKKDVEFLPPQLPSDKFKDDPVGPFGQESVTSVTELMVQCKK 
PLKVS DELVQQYQl KNQCLSAIASDAEOEPK IDP YAFVEGDEEF 
LFPDKIODRQNSEREAGKKHKVEDGTSSVTVLSHEEDAMSLFSPS 
I KQDAPRPTSKARP PSTS L IYDSDLAVS YTDLDN LFNSDEDELT 
PGS KRSANGSDDKASCKE S KTGNLDPLSC ISTADliHKMYPTP PS 
LEQKIMG FS PMNMNNKE YGSM DTTPGGT VLEGNS S S IGAQ FK I E 
VDEGFCSPKPSEIKDFSYVYKPENCQILVGCSMFAPLKTLPSQY 
LPLIKLPEECIYRQSWTVGKLELLSSGPSMPPIKEGDGSNMDQE 
YGTAYTPQTHTSCGMPPSSAPPSNSGAGILPSPSTPRFPTPRTP 
RTPRTPRGAGGPASAQGSVKYENSDIiYS PASTPSTCRPLNSVE P 
ATVPS I PEAHSLYVNLILSESVMNLFKDCNSDSCCI CVCNMNI K 
GADVGV YI PDPTQEAQYRCTCG FSAVMNRKFGNNSGLFFEDELD 
IIGRNTDCGKEAEKRFEALRATSAEHVNGGLKESEKLSDDLILL 
LQDQCTNLFSrFGAADQDPFPKSGVISNWVRVEERDCCNDCYLA 
LEHGRQFMDNMS GGKVDEALVKS S CLHP WSKRNDVSMQCSQDIL 
RMLLSLQPVLQDAIQKKRTVRPWGVQGPLTWQQFHKMAGRGSYG 
TDESPEPLPIPTFLLGYDYDYIiVLSPFALPYWERIjMLEPYGSQR 
D IAYWLCPENEALliNGAKS V FRDLTAI YESCRLGQHRPVSRliL 
TDG I MR VGS TAS KKLSEKLVAE W FSOAADGNNEAFSKLKLYAQ V 
CRYDLGPYLASLPLDSSLLSQPNLVAPTSQSLITPPQMTNTGNA 
NTPS ATLAS AAS S TMTVTSGVAI STS VATANSTLTTASTSS S S S 
SNIiNSGVSSNKLPSFPPFGSMNSNAAGSMSTQANTVQSGQLGGQ 
QTSALQTAGISGESSSLPTQPHPDVSBSTMDRDKVGIPTDGDSH 
AVTYPPAI WYIIDPFTYENTDESTNSSS VWTLGIiLRCFLEMVQ 








TL PPHIKSTVS VQI I PCQYLIiQ PVKHBDRE I YPQHLKSLAFSAP 
TQCRRPLPTSTNVKTLTGFGPGIiAMETALRS PDRPECI RLYAPP 
FILAPVKDKQTELGETFGEAGQKYNVLFVGYCLSHDQRWILASC 
TDLYGELLETCIINIDVPNRARRKKS5ARKFGLQKLWEWCLGLV 
QMSSLPWRWIGRLGRIGHGELKDWSCLLSRRNLQSLSKRLKDM 
CRMCGISAADSPSILSACLVAMEPQGSFVIMPDSVSTGSVFGRS 
TTLNMQTSQIjNTPQDTSCTHILVFPTSASVQVASATYTTENLDL 
APNPNNDGADGMGIFDLLDTX3DDLDPDIINILPASPTGSPVHSP 
GSHYPHGGDAGKGQSTDRLIjSTEPHEEVPNIIiQQPLAIiGYFVST 
AKAGPLPDWFWSACPQAQYQCPIjFLKASLHLHVPSVQSDELLHS 
KHSHPIiDSNQTSDVLRFVLEQYNALSWLTCDPATQDRRSCLPIH 
FWIjNQTjY N F I MNML 


5370 


1226 


716 


RWSRKLEIiRRAAQATBSRPPQSQEMHPPTGKEVHALKRLRDSAN 
ANDVETVQQLLEDGADPCAADDKGRTALHFASCNGNDQIVQLLL 
DHGADPNQRDGIiGNTPLHLAACTNHVP VI TTLLRGGAR VDALDR 
AGRTPLHLAKSKLNILQEGHAQCLKAVR/HGGEADHPYAEGVSG 
APRAT*AARCSGVFPSPSRWLGSAPWSRSSCTIWSLPLHEAKCR 
AVRPLSSAAQGSAPSSSSCCTVSTSLALAESLSLFRACTSLPVG 
GCISWL 


5371 


1331 


1*7 


I AAMLWKLLLRSQSCRLCS FRKMRS P PKYRPFLACFTYTTDKOS 
SKENTRTVEKLYKCSVDIRKIRR\*KDGYF*RMKPMLKKLRI/P 
LQELGADETAVAS I LERCP EAI VCS PTAVNTQRKLWQLVCKNEE 
ELIKLIEQFPES FFT I KDQ SNQKLNVQFFQELGLKNW I S RLLT 
AAPNVFHN P VE KNKQM VR I LQES YLD VGGS EANMKVWLLKliLSQ 
NPF I LLNS PTAI KETLE FLQEQGFTS FE I LQLLS KL KG FL FQLC 
PRS IQNS ISFS KNAFKCTDHDLKQLVLKCPALLYYSVPVLEERM 
QGLLREGIS1AQIRETPMVLELTPQIVQYRIRKLNSSGYRIKDG 
H3uANLNGSKKEFEANFGKlQAKKVRPLFNPVAPLNVEE 


5372 


51 


857 


SPGAQFLWAAPDMPDPLFSAVQGKDEILHKALCFCPWLGKGGME 
PtiRLL I tiLF VTELS GAHNTTV FQGVAGQS LQVS C P YDSM KHWGR 
RKAWCRQLGEKGPCQRWSTHNLWLLSFLRRWNGSTAITDDTLG 
GTLTI TLRNLQ PHDAGLYQ CQS LHGS E ADTLRKVL VE VLAD PLD 
HRDAGDLWFPG\DLRASRM?MWSTAS?GASWKEKSPSHPLPSFS 
SW PAS FSSRF * Q PAPSGLQPGMDRS QGHIHPVNWTVAMTQG I SS 
KLCQG 


5373 


2814 


346 


VKKTKSIFNSAMOBMEVYVENIRRKFGVFNYSPFRTPYTPNSQY 
GMLLDPTNPSAGTAKI DKQEKVKLNFDMTASP KI LMS KPVLSGG 
TGRRISLSDMPRSPMSTNSSVHTGSDVEQDAEKKATSSHFSASE 
ESMDFLDKSTASPASTKTGQAGSLSGSPKPFSPQLSAPITTKTD 
KTSTTGSILNLNLDRSKAEMDLKBLSESVQQQSTPVPLISPKRQ 
IRSRFQLNLDKT IESCKAQLGINE ISEDVYTAVEHSDSEDS EKS 
DSSDSE Y I S DDEQ KS * GTSQEDTED KEGCQMD KE PS AVKKKP KP 
TNPVE I KEELKS TS PASEKADPGAVKDKASPE PEKDFSGKAKPS 
PHPIKDKLKGKDETDSPTVHLGLDSDSENNELVIDIjGEDHSGRE 
GRKNKKEPKEPSPKQDWGKTPPSTTVGSHSPPETPVIiTRSSAQ 
TSAAGATATTSTSSTVTVTAPAPAATGSP VKKQRP LLPKE \ TAP 
AVQRSCGTSSTVQQKEITQSPSTSTITLVTSTQSSPLVTSSGSM 
STLVS S VNGDL P I GTASADVAAD IAKYTS KL\ M DAI KGTM \TE I 
YNDLS KN\TTW KAQLAEDSQGLRI E IE KLQWLHQQ EL \ S EMKHN 
LEL TMAEMRQS WEQERDRL I AEVKKQLELEKQQAVDETKXKQWC 
ANFKKEAI F YCCWNTS YCD YPCQ\ QAHWPEH \MXS CTQSATAPQ 
\QEADAE\VNTETLNKSSQGSSSSTQSAPSETASA\SKEKETSA 
EKSKESGSTLDLSGSRETPSSILLGSNQGSDHSR\SNKSSWSSS 
DEKRGS \TRSDHN/TPSTQHGRSLL PGKESRAGTP PLGTSK 


5374 


2814 


346 


VKKTKS iFNSAMQEMEVYVENIRRKTOVFNYSPFRTPYTPNrOX" 

QMLLDPTN PS AGTAK I DKQEKVKLNFDMTASPK I LMS KPVLSGG 

TGRRISLSDMPRSPMSTNSSVHTGSDVEQDAEKKATSSHFSASE 

ESMDFLDKSTASPASTKTGQAGSLSGSPKPPSPQLSAPITTKTD 

KTSTTGSILNLNLDRSKAEMDLKELSESVQQQSTPVPLISPKRQ 

IRSRFQLNLDKTIESCKAQLGINE ISEDVYTAVEHSDSEDS EKS 

DSSDSEYISDDEQKS*GTSQEDTEDKEGCQMDKEPSAVKKKPXP 








TNPVE I KEBLKS TS PAS BXADPGAVXDKAS PE PEKDFSG KAKPS 
PH P I KDKLKGKDETDS PTVHLGLDS DS E\NELVI DLGEDHSGRE 
GRKNK KE P KE PS PKQDWG KT P PSTTVGSHS P PETP VLTRSS AQ 
TSAAGATATTST3STVT VTAPAPAATGS P VKKQRPLLPKE \TAP 
AVQRSCGTSSTVQQKEITQSPSTSTITLVTSTQSSPLVTSSGSM 
STLVS S VNGDLP IGTASADVAADI AKYTSKL\MDAIKGTM\TE I 
YlTOLSKN\TTMKAQIJ^DSO^LRIEIEKLQWLHO^EI,\SEMKHN 
LELTMAEMRQSWEQERDRLIAEVKKQLELEKQQAVDETKKKQWC 
ANFKKEAIPYCCWNTSyCDYPCQ\QAHWPEH\MKSCTQSATAPQ 
\QEADAE \VNTETLNKS SQGSS SSTQS APSETAS A\SKEKETSA 
EKSKESGSTLDLSGSRBTPSSILLGSNQGSDHSR\SNKSSWSSS 
DKKRGS\TRSDHN/TPSTQHGRSLLPGKESRAGTPFLGTSK 


5375 


2907 


1116 


HIF1AEEEPMLERRCRGPLAMGPAQPRLLSGPSQESPQTLGKES 
RGLRQQGTSVA\QSGAQAPGRAHRCAHCRRHFPGWVA\LWLHTR 
RCQA/RGLPLPCPECGRRFRHAPFLALHRQVHAAATPDWGFACH 
LCGQSFRGWVALVLHLRAHSAAKAGPPACPKMARDAFWRRKAAS 
SSI LRRCHPSRPRGPRP FI CGNCGRS I LPTWDQ / LKVAH KRVHV 
SRRP*ERGPPAKVFWGPRPRGPPTGDTPPGPGGDAVDRPF\QCA 
CCGKR FRHK\ PNL IR SHAACTSGER PHQ/CSRECG \ KR FTNKP Y 
LTS\HRRITHTARQPYPCKECGRRFRHKPNLLSHSKIHKRSEGS 
AQAAPGPGSPQLPAGPQESAAEPTPAVPLKPAQEPPPGAPPEHP 
QDP IEAPPSL YSCDDCGRS FRLERFLRAHQRQHTGERP FTCAEC 
GKNFGKKTHLVAHSRVHSGERPFRLARKCGRRFLPRASQSGGRN 
SAEPNAPRFGPFVCPDCGKAFRHKPYLAAHRPIATPAEKPYVCP 
DCRKAFSQKSNL\VSHRRIHTGERPYACPDCDRSFSQKSNLITH 
RKS H I RDGAFCCA I CGQTFDDEER LLAHQ KKHD V 


5376 


4504 


591 


VSTFSLCIjWPAGGGGRGRVSSMAQSKRHVYSRTPSGSRMSAEAS 
ARP LRVGSRVEV 1 G KGHRGWAY VGATLFATGKWVGVI LDEAKG 
KNDGTVQGRKYFTCDEGHG I FVRQS Q I Q V FEDGADTTS PETPDS 
S AS KV LKREGTDTT AKTS KLRGLKP KKAP TAR KTT TRRP KPTRP 
ASTGVAGASSSLGPSGSASAGELSSSEPSTPAQTPLAAPI I PTP 
VLTSPGAVPPLPSPSKEEEGLRAQVRDLEEKLETLRLKRAEDKA 
KLKELEKHKIQLEQVQEWKSKMQEQQADLQRRLKEARKEAKEAL 
EAKERYMEEMADTADAIEMATIjDKEMAEERAESLQQEVEALKER 
VDELTTDLEILKAEIEBKGSDGAASSYQLKQLEEQNARLKDALV 
RMRDLSSSEKQEHVK\LQKLMEKKNQELEWRQQRERLQEELSQ 
AESTI DELKEQVDAALGAEEMVEMLTDRNLNLEEKVRELRETVG 
DLEAMNEMNDBLQENARETELELREQLDMAGARVREAQKRVEAA 
QETVADYQQTIKKYRQLTAHLQDVNRELTNQQEASVERQQQPPP 
ETFDFKIKFAETKAHAKAIEMELRQMEVAQANRHMSLLTAFMPD 
SFLRPGGDHDCVLVLLLMPRLICKAELIRKQAQEXFELSENCSE 
RPGLRGAAGEQLSFAAIGLVY\SLMPAAGHRYHRY*CHALSQCR 
LD\VYKKVGSLYPEMSAJHERSLDFLIELLHKDQLDETVNVEPLT 
KAIKYYQHLYS IHLAEQP ED CTMQLADH I KFTQS ALDCMS VEVG 
RLRAFLQGGQBATDIAIiLLRDLETSCS\DIRQFCKI<IRRRMPGT 
DAPG1 PAALAFGPQVSDTLLDCRKHLTWWAVLQEVAAAAAQLI 
APLAENEG LL VAALEELAFKAS EQ I YGTPSSS P YECLRQS CNI L 
I STMNK\ LVTAMQEGE YDAERPPSKP P P \ VELRAAALRAE I TDA 
EGLGLKLEDRBTVIKELKKSLKIKGEELSEANVRLTLLEKKLDS 
AAKDADER I E KVQTR LE ETQALLRK KE KEFEETMDALQAD I DQL 
EAEJCAELKQRLNSQSKRTIEGLRGPPPSGIATLVSGIAGEEQQR 
GAIPGQAPGSVPGPGLVKDS PLLLQQ I SAMRLHISQLQHENS IL 
KGAQMKASLASLPPIiHVAKLSHEGPGSELPAGALYRKTSQLLET 
LNQLSTHTHWDITRTSPAAKSPSAQLMEQVAQLKSLSDTVEKL 
KDEVLKE1VSQRPGATVPTDFATFPSSAFLRAKEEQQDDTVYMG 
KVTFSCAAG FGQRH RL VLTQE QLHQLHS R LI S 


5377 


762 


1106 


DVPCKRVLPAEAQEKGQLTLSCGESGEEG\F*YHEVRQA^GES^- 
/ WFG PNVRL VHTQL KTKK PSGTLKAKFYLHTGSTKFAARI S CT 3C 
SS *WPGYDGWWGGQYI FIFRGMRWEEQP 


5378 


2009 


664 


QASGTTLRPLPDLPQLKRREATSRNRALKPRGRLVLMTSCLPAL 
R F I AT PRLSAMPH 1 DNDVKLD FKDVLLR PKRSTLKSRS E VDLTR 


S FSFRNSXQTYSGVP 1 1 AANMDTVGTFEMAKVLCKS * V PGS FWD 
V PQMGCVFL I Y KLFTLKWKM LLLS VLLPAS I LVAE KFSL FTAVH 
KHYSLVQWQEFAGQNPDCLEHLAASSGTGSSDFEQLEQILEA1P 
QVKriCLDVANGYSEHFVEFVKDVRKRFPQHTIMAGNVVrGEMV 
B B L I L5GAD 1 1 KVG I GPGS VCTTRKKTG VG YPQLSAVME CADAA 
HGLKGHIISDGGCSCPGDVAKAFGAOADFVMLGGMLAGHSESGG 
EL I ERDGKKYKLF YGMS S* I \AM \KKYAGGVAE YRASEGKTVEV 
PFKGDVEHTIRDILGGIRSTCTYVGAAKLKELSRRTTFIRVTQQ 
VNPIFSEAC 


5379 


2009 


664 


QASGTTLRPLPDLPQLKRREATSRNRALKPRGRLVLMTSCLPAL 
R F I ATPR LS AMPH I DND VKLDFKD VL LR PKRS TLKS RS EVDLTR 
SFS FRNSKQTYSGVP I IAANMDTVGTFEMAXVliCKS * VPGSFWD 
VPQMGCV FLIYKLFTLKWKMLLLS VLLPAS I LVAEKFSLFTAVH 
KHYSLVQWQEFAGQNPDCLEHLAASSGTGSSDFEQLEQILEAIP 
QVKYICLDVANGYSEHFVRFVKDVRKRFPQHTIMAGNWTGEMV 
EE LI LSGAD 1 1 KVGIGPGSVCTTRKKTGVGYPQLSAVMECADAA 
HG LKGHI I SDGGCS C PGD VA KAFGAGAD FVMLGGMLAGHS ESGG 
EL I ERDGKKYKLF YGMS S * I \AM\KKYAGGVAE YRASEGKTVEV 
PFKGDVEHT IRDILGGI RSTCT YVGAAKLKELSRRTTFIRVTQQ 
VNPIFSEAC 


5380 


2 


2050 


PSRAGGAERGRAAAARSPGGSAAGVJECPSVLDEAGACTMSSCVS 
SQPSSNRAAPQDELGGRGSSSSESQKPCEALRGLSSLSIHLGME 
SFIWTECEPGCAVDLGLARDRPLEADGQEVPLDTSGSQARPHL 
SGRKLSLQERSQGGIiAAGGSLDMNGRCICPSLPYSPVSSPQSSP 
RLPRRPTVESHHVS I TGMQDCVQLNQYTLKDE IGKGS YGWKLA 
YNENDNTYYAMKVLSKKKLIRQAAFPRRPPPRGTRPAPGGCIQP 
RGP I \ EQVYQE IA\ I LKKLDHPNW\ KLVEVL\DDPNEDHLYMV 
F\ELVNQGPVMEVPTLKPLSEDQARFYFQDLIKGIEYLHYQKII 
H\RDI KPSNLL VGEDGHI KIADFG VSNE FKGSDALLSNTVGTPA 
FMAPE SLSE TRKI FSG KALD VW AMG VTL YC F VFG * C P FM DE RIM 
CLHSKIKSQALEFPDQPDtABDLKDLITRMLDKNPESRIWPEI 
KLHPWVTRHGA2PLPSEDSNCTLVEVTEEEVENSVKHIPSLATV 
ILVXTMIRKRSFGNPFEGSRRBERSLSAPGNLLTKKPTRECESL 
SELKT*KISPLPACCKVT*EFPHPSGCKPSCWQPPFLHTHSQPR 
*PEPPRTDBALCPYETGRTCWAPLLQVLWVfVGTPLPFPLSTSWL 
PDLVGAPGSHFCFLNIALLRYNSHTM 


5381 


2 


2050 


PSRAGGAERGRAAAARSPGGSAAGWECPSVLDEAGACTMSSCVS 
SQPSSNRAAPQDELGGRGSSSSESQKPCEALRGLSSLSIHLGME 
SFIWTECEPGCAVDLGLARDRPLEADGQEVPLDTSGSQARPHL 
SGRKLSLQERSQGGLAAGGSLDMNGRCICPSLPYSPVSSPQSSP 
RLPRRPTVESHHVS I TGMQDCVQLNQYTLKDE IGKGS YGWKLA 
YNENDNTYYAMKVLSKKXLIRQAAFPRRPPPRGTRPAPGGCIQP 
RGPI\EQVYQEIA\ILKKLDHPNW\KLVEVL\DDPNEDHLYMV 
F\ELVNQGPVMEVPTLKPLSEDQAR FYFQDLIKGIEYLIIYQKI I 
H\RDI KPSNLLVGEDGHI KIADFGVSNEFKGSDALLSNTVGTPA 
FMAPE SLSETRK I FSG KALD VWAMGVTLYCFVFG*CPFMDERIM 
CLHSK I KSQ ALE FPDQ PD I AEDLKDL I TRMLDKNPESRI WPE I 
KLHPW VTRHGAE PLPSEDENCTLVEVTEEE VENS VKH I PSLATV 
ILVKTMIRKRSFGNPFEGSRREERSLSAPGNLLTKKPTRECESL 
SELKT*KISPLPACCKVT*EFPHPSGCRPSCWQPPFLHTHSQPR 
* PE PPRTDEALCPYETGRTCWA PLLQVLWWVGTPL PFPLSTS WL 
PDLVGAPGSHFCFLNIALLRYNSHTM 


5382 


1536 


203 


GARGS QQDA P ALQEAE VRGP ERAQ PARGRMTKARL FRLWLVLGS 

LTADSDVDEFLDKFLSAGVKQSDLPRKETEQPPAPGSMEESVRG 
YDWSPRDARRSPDQGRQQAERRSVLRGFCANSSLAFPTKERPFD 
DIPNSELSHLIVDDRHGAIYCYVPKVACTNWKRVMIVLSGSLLH 
RGAPYRDPLRI PREHVHNAS AHLTFNK FWRRYGKLSRHLM KVKL 
KKYTKFLFVRDPFVRLISAFRSKFELENEEF/* PQVRRAHAAAV 
RQPHQPARLGARGLPRWPQ\VSFANFIQYLLDPHT3KLAPFNEH 
WRQ VYRLCHP CQ I D YD FVGKLETLDEDAAQLLQLLQ VDLAAPLP 








PELPGTGPPSSWEEDWFAKIPLAWRQQLYJCLYKADFVLFGyPKP 
ENLLRD 


5383 


45 


5250 


VERLLGC RNS KRTWRM L I S KNMP WRRLQG I S FGMYSAEE L KKLS 
VKSITNPRYLDSLGNPSAMGLYDLALGPADSKEVCSTCVQDFSN 
CSGHLGHI ELPLTVYNPLLFDKL YLLLRGSCLMCHMLTCPRAVI 
HLLLCQLRVLE VGALQAV YELER I LS RFLE ENADPS AS E IREEL 
EQYTTEIVQNNbl/jSGXSAHVKNVCESKSKljIALFWKAHMNAKRC 
PHCKTGRSWRKEHNSKLTITFPAI4VHRTAGQKDSEPLGIEEAQ 
IGKRG YLTPTS AREHLS ALWKNEGFFLNYL FSGMDDDGMESRFN 
PS VFFLD FLWP PSRS R P VSRLGDQMFTNG QT VNLQ AVMKDWL 
IRKLLALMAQEQKLPEEVATPTTDEEKDSLIAIDRSFLSTLPGQ 
SLIDKLYNIWIRLQSHVNIVFDSEMDKLMMDKYPGIRQILEKKE 
GLFR KHMMG KRVDYAARS VICPDMY INTNE I G I PMV F ATKLTYP 
QPVTPWNVQELRQAVINGPNVHPGASMVINEDGSRTALSAVDMT 
QRE AVAKQLLTPATGAP KP QGTK I VCRHVKNGD I LLLNRQPTLH 
RPSIQAHRARZLPEEKVLRLHYANCKAYNAJDFDGDEKNAHFPQS 
ELGRAEAYVLACTDQQYLVPKDGQPLAGLIQDHMVSGAS MTTRG 
CFFTREHYMELVYRGLTDKVGRVKLLSPSILKPFPLWTGKQWS 
TLLINIIPEDHIPLNLSGKAKITGKAWVKETPRSVPGFNPDSMC 
ESQVIIREGELLCGVLDKAHYGSSAYGLVHCCYEIYGGETSGKV 
LTCIiARLFTAYLQLYRGFTLGVBDI LVKPKADVKRQR 1 1 EESTH 
CGPQAVRAALNLPEAASYD3VRGKWQDAHLGKDQRDFNMIDLKF 
KEEVNHYSNE I NKACMPFGLHRQFPENTLQLMVQSGAKGSTVNT 
MQISCMjGQIELEGRSTPLMASGKSIjPCFEPYEFTPRAGGFVTG 
RFLTG I KPPEFFFHCMAGREGLVDTAVKTSRSG YLQRCI I KHLE 
GLWQYDLTVRDSDGSWQFLYGEDGLDIPKTQFLQPKQFPFLA 
SNYEVlMKSQHLHEVLSPJu^PKKALHHFRAIKKWQSKHPNTLLR 
RGAFLS YSQKI QEAVKALKLESENRNGR/RPWDS / G/RMLRMWY 
ELDEESRRKYQKKAAACPDPSLSVWRPDIYFASVSETFETKVDD 
YSQEWAAQTE KS YEKSELSLDRLRTLLQL\KWQRS LCEPGEAVG 
LLAAQS IGEPSTQMTLNTFHFAGRGEMNVTLGI PRLREILMVAS 
ANIKTPMMSVPVLNTKKALKRVKSLKKQLTRVCLGEVLQKIDVQ 
ES FCMEEKQNKFQVYQLR FQFL PHAY YQQE KCLRPEDI LRFMBT 
RPFKLLMES I KKKNNKASAFRNVNTRRATQRDLDNAGELGRSRG 
EQEGDEEEEGHIVDAEAEEGDADASDAKRKEKQEEEVDYESEEE 
EEREGEENDDEDMQEERNPHREGARKTQEQDEEVGL/GH*GGPV 
PSRP PDAAPETHPQPGAPGA\ EAMERRVQ AVRE I H PF I DDYQYD 
TEESLW CQVTVKL PLMK I NFDMSSIiVVSLAHGAVI YATKG I TRC 
IiLNETTNNKNEKELVLNTEGINLPELFKYAEVLDLRRLYSNDIH 
Al ANT YG I EAALR VI E KEI KDVFAVYG I AVDPRH LS LVAD YMCF 
tL<*v x Jvt*i>WKfc^lRSNSSPLQQMTFETSFQFLKQATMLGSHDELR 
S PSACLWGKWRGGTGLFELKQPLR


5384


19* 


886 


QSCGQRLPTVL+L*GPPGSCPCILSLF\PGRPHALPKIRPYINI 
TILKGDKGDPGPMGLPGYMGREGPQGEPGPQGSKGDKGEMGSPG 
«r^w*vnr rrtPavwrK^l/Ujft&GEDryTLLFERVFVNLDGCFDMAT 
GQFAA PLRG I YFFSLNVHS WW YKETY VH I MHNQKKAVI L YAQPS 
ERS I MQS QS VMLDLAYGDRVWVRLFKRQR ENAI YSNDFDTY I TF 
SGHLIKAEDD 


5385


326 


799 


iJi'jvrKj,iv^iu/u^«^fc J wui/u^KAZj\KAKKAVLKDW 
SPTFRRPKTL*LRRQPKYPWKSTPRRKKLDHHVIIKFPLTTE*A 
VKKI ENNS IiL VFTVDVKANKHQI KQAVKK / LCDID VAK VNTL J Q 
S DGE RKAY VR LA PDYDAL WATJCIGI T 


5386 


326 


799


LWVPRTKKEAPAPPKAEAKAKAL^KAlCKAVLKDVHSHKKNklHM 
SPTFRRPKTL*LRRQPKYPWKSTPRRNKLDHHVIIKFPLTTE*A 
VKKI ENNSLL VFT VD VKANKHQ I KQAVKK/LCD ID VAKVNTL I Q 
SDGERKAYVRLAPDYDALWATKIGIT 


5387 


2 


2117 


FWAASGGCWrWiGERRAGSLLSASYGTFAMPGMVLFGRRWAlA 
SDDLVFPGFFELWRVLWWIGILTLYLMHRGKLDCAGGALLSSY 
LI VLMILLAWICTVS AI MCVSMRGTI CNPGPRKSMSKLL YIRL 
ALFFPEM VWASLGAAWVADGVQCDRTWNG I IATWVS WI I IAA 
T WS 1 1 1 VF0PLGGKMAP YSSAG PSHLDSHDS S QLLNGLKTAAT








SVWBTRIKLLCCCIGKDDHTRVAFSSTABLFSTYPSDTDLVPSD
I AAGIALLHQQQ DNI RNNQ3PAQ WCHA PGS SQEADLD AE LKNC 
HH YMQ FAAAAYG WPL Y I YRNPLTG LCR I GGDCCRS KNPQTMT /M 
VGGDQLQL/ CTSAP ILHTHRAAVQGLHPRQLPWTRFTELPFLVA 
LDHRKESWVAVRGTMSLQDVLTDIiSAESEVLDVECEVQDRLAH 
KG I SQ AAR YVYQRL INDG I LSQAFS I APE YRLVI VGHS LG GG AA 
ALLATMVRAAYPQVRC YAFS PPRGLWSKALQEYSQSFI VS LVLG 
KDVI PRLSVTNLEDLKRRI LRWAHCNKPKYKI LLHGLWYELFG 
GNPNNLPTELDGGDQEVIiTQPLLGEQSLLTRWSPAYSFSSDSPL 
DSSPKYPPLYPPGRIIHW5EEGASGRFGCCSAAHYSAKWSHEAE 
FSKI L I GPKMLTDHM PD I LMRALDS WSDRAACVS CPAQGVS 5 V 
DVA 


5388


1569 


753 


TADGGAGGGGRRQAGVRRHYLYPFl'GGYRRRRAACQAEkPAARS 
KDTDLAAYQKGNLGVQLRNMAQE TNHS QVPMLCSTGCGF YGN PR 
TNGMCS VCYKEHLQRQNSSNGRIS P PVQCTDGS VPEAQSALDST 
SSSMQPSPVSNQSLLSESVASSQLDSTSVDKAVPETEDVQASVS 
DTAQQPSEEQSKSLE\NRNKKRIAVSCAGRKWDLLGLNAGVEMF 
TWYTVTQMYTIALTITKQMLKNFVFQQEFKSFGSFHQQLLEYK 
ILEHLQTKN 


5389


1569 


753 


TADC^AGGGGR^QAGV^RJ[lYLVPFTGGV^AACQAERPAAk6 
KDTDLAAYQKGNLGVQLRNMAQETNHSQVPMLCSTGCGFYGN ?R 
TNGMCS VCYKEHLQRQNS SNG RIS P P VQCTDGS VP 2AQS ALDS T 
SSSMQPSPVSNQSLLSESVASSQLDST3VDKAVPETEDVQASVS 
DTAQQ PS EEQS KS LE \NRNKKR IAVS CAGRKWD LLGLNAG VEM F 

TWYTVTQMYTIALTITKQMLKNFVFQQEFKSFGSFHQQLLEYK 
ILEHLQTKN 


5390 


217 


1332 


edprklmeokmwsecegpemslvcltdfqahareqLskstrdfi 
eggadds itrddni aafkri rlrprylrdvsevdtrttiqgee i 

SAP I CIAPTGFHCLVWPDGEMSTARAAQAA\G I CYITSTFAS CS 

LED I viaapeglrwfqlyvhpdlqlnkqliqrveslgfkalvit 

LDTPVCGNRRHDIRNQLRRNIiTIiTDLQSPICKGNAIPYFQMTPIS 

tslcwndlswfqsitrlpiilkgiltkedaeiavkhnvqgiivs 
nhggrqlde vlas i dalte waavkg ki evyldggvrtgndvlk 
alalgakci flgda i lwalas kgehgvkevlni ltnefhtsma\ 
ltgcrsvae inrnlvqfsrl 


5391

1 


1292 


VKKAAGRSRGPPTAGGQRCEEAPGTVMERRLGVRAIVVKENRGS ? 
Q PPVCNKLMIIQEQLKVMFVGGPNTRKDYHIEEGEEVFYQIjEGDM 
VLRVLEQGKHRDWIRQGEIFLLPARVPHSPQRFANTVGLWER 
RRLETELDGLRYYVGDTMDVLFEKWFYCKDLGTQLAPI IQEFFS 
SEQYRTGKPIPDQLI/KEPPFPLSTRSIMEPMSLDAWLDSHHREL 
QAGTPLSLFGDTYETQVIAYGQGSSEGLRQNVDW7LWQLEGSSV 
VTMGGRRLSLGPWMDSLLVLSWGPSY\AW\ERTQGSVALSVT\Q 
DPACKKS PWGEPSCHGLKAATGVPSTLEVPSLPNNSPS PHYLSV 
YCRCVPHRPAHCCHPPSCPSQPRCHAPGRAAAPHIiLWQTQPTAL 
P VLPGGLPPAPLLP I PLSLQTQCSTSTPRRPSIKAS 


5392 


1 


1623 


I RGSNAQKWGASGSGGAG PQPDPAGPGGVPALAAAVLGACEPR
CAAPCPLPALSRCRGAGSRGSRGGRGAAGSGDAAAAAEWIRKGS 
FXHKPAHGWLHPDARVLGPGVS YWRYMGCIEVLRSMRS LDFNT 
RTQVTREAINRLHEAVPGVRGS WKKKA PNKALAS VLGfCSNLRFA 
GMSISIH ISTDGLS hS VPATRQVI ANHHMPSI SFASGGDTDMTD 
YVAYVAKDP INQRACH I LECCEGL\AQS I ISTVGQAFELRFKQY 
LHSPPKVALPPERLAGPEESAWGDEEDSLEHNYYNS IPGKEP PL 
vnju vuskuuj i UruUj Lf\±iU\J<j PS PSIiRDACSLPW DVGSTGTAP 
PGDGYVQADARGPPDHEEHLYVNTQGLDAPEPEDSPKKDLFDMR 
PFEDALKLHECSVAAGVTAAPLPLEDQWPSPPTRRAPVAPTEEQ 
LRQEPWYHGRMSRRAAERMLRADGDFLVRDSVTNPGQYVLTGMH 
AGQPKHLLLVDPEGWRTKDVLFESISHLIDHHLQNGQPIVAAE 
SELHLRGWSREP 


5393 


2 


982 


GGDSAGMTMK'I^^QNVCPRNLWIiLQPLTVIJjIJjASADSQAAAP 
PKAVLKLEPPWINVLQ\EDSVTLTCQGAPQP/ERSDS1QWFHNG 








\NLIPTHTQPS\YRFKANNN\DSGEYTCQTGQTSL\SDPVHIiTV 
LS EWLVLQT PH LE FQEGET I MLRCHS \ WRDKP \LVKVTF PQNGK 
SQKFSHLDPTFS I PQANHSHSGDYHCTGNIGYTLFSSKPVTITV 
QVPSMGSSS PMG 1 I VAWIATAVAAI VAAWALI YCR KKRISAN 
STDPVKAAQFEPPGRQMIAIRKRQLEETNNDYETADGGYMTLNP 
RAPTDDDKNI YLTLPPNDHVNSNN 


5394

2 


982 


ggdsagmtmetqmsqnvcprnlwllqpltvllllasadsqaaap 
pkavlkleppwrnvlq\edsvtltcqgapqp/ersdsiqwfhng 
\nlipthtqps\yrfkannn\dsgeytcqtgqtsl\sdpvhltv 
lsewlvlqtphlefqegetimlrchs\ wrdkp \lvkvtffqngk 
sqkfs hldptfs i pqanhshsgd yhctgn i g ytl fss k p vti tv 
qvpsmgssspmgiivawiatavaaivaawaliycrkkrisan 
stdpvkaaqfep pgrqmi ai rkrqleetnndyetadggymtlnp 
raptdddkni yltlppndhvnsnn 


5395


3135 


531 


RASDAKNQEGLLNTRRKSTDSVP I S KSTLSRSLSLQASDFDGAS 
SSGNPEAVALAPDAYSTGSSSASSTLKRTKKPRPPSLKKKQTTK 
KPTETP P VKE TQQEPDEESLVPSGENLAS BTKTES AKTEGP S PA 
LLEET PLE PAAG PKAACPLD SE SVEGWP P AS GGGRVQNS P P VG 
RKTLPLTTAPEAGEVTPSDSGGQEDSPAKGHSVRLEFDYSEDKS 
SWDNQQENPPPTKK1GKKPVAKMPLRRPKMKKTPEKLDNTPASP 
PRSPAEPNDI PIAKGTYTFD I DKWDDPNFNPFSSTSKMQESPKL 
PQOSYNFDPDTCDESVDPFKTSSKTPSSPSKSPASFEIPASAME 
ANG VDGDGLN KPAKKKKTPL XTDT FRVKKS PKRS PLSDP P SQDP 
TPAATPETPP VISAWHATDE EKLAVTNQKWTCMTVDLEADKQD 
YPQPSDLSTFVNETKFS5 PTEELDYRNS YEIE YMEKIGSSLPQD 
DDAPKKQALY LMFDTSQES P V KS S P VRM SESPTPCSGSSFEETE 
ALVNTAAKNQHPVPRGLAPNQESHLQVPEKSSQKELEAMGLGTP 
SEAIEITAPEGSFASADALLSRLAHPVSLOGALDYIiEPDLAEKN 
PPLFAQ KLQREAAHPTD VS I S KTAL YSR IGTAEVE KPAG LLFQQ 
PDLDSALQIARAEI ITKEREVSEWKDKYEESRREVMSMRKIVAE 
YEKT I AQM I EDEQREKS VS \ HQTVQQLVLEKEQa\ LADLNS VEK 
\SLADLFRRYBKMKEVLEGFRKNEBVLKRCAQEYIiSRVKKEEQR 
YQALKVHA\EEKLDRANAE\IAQVRGKAQQEQAAHQASLAERSS 
CRV\ DALERTLEQKNKE I E ELTKI CDEL I AKKGKS 


5396 


3135


531 


RASDAKNQEGLLNTRRK5TDS VP jl^KSTLSRSLS LQASDFDGAS 
SSGNPEAVALAPDAYSTGSSSASSTLKRTKKPRPPSLKKKQTTK 
KPTETPPVKETQQEPDEESLVPSGENLASETKTESAKTEGPSPA 
LLEETPLEPAAGPKAACPLDS ES VEG WP PASGGGRVQNS P P VG 
R KTLPLTTAPEAGE VTPSDS GGQEDS PAXGHSVRLEFDYSEDKS 
SWDNQQENPPPTKKIGKKPVAKMPLRRPXMKKTPEKLDNTPAS? 
PRSPAEPNDIPIAKGTYTFDIDKWDDPNFNPFSSTSKMQESPKL 
PQQSYNFDPDTCDESVDPFKTSSKTPSSPSKSPASFEIPASAME 
ANGVDGDGLNKPAKKKKTPLKTDTFRVKKSPKRSPLSDPPSQD? 
TPAATPETPPVISAVVHATDEEKLAVTNQKWTCMTVDLEADKQD 
YPQPSDLSTFVNETKFSSPTEELDYRNSYEIEYWEKIGSSLPQD 
DDAPKKQALYLMFDTSQESPVKSSPVRMSESPTPCSGSSFEETE 
ALVNTAAKNQHPVPRGLAPNQE S HLQVPEKS SQ KELE AMGLG TP 
S EAI EI TAPEG S FA SAD ALLS RLAH PVSLCG ALD Y L E P DLAE KM 
P PLFAQKLQREAAHP TDVS I S KTAL YSR I GTAE VEKPAGLLFQQ 

pdldsalqiaraei itkerbvsewkdkyeesrrevmemrkivae 
yektiaqmiedeqreksvs\hqtvqqlvlekeqa\ladlnsvek 

\ S LADLFRR YEKMKE VLEGFRKNEE VL KRCAQE YLSR VKKEEQR 
YQALKVHA\EEKLDRANAE\ I aqvrgkaqqeqaahqaslaerss 


5397 


3135


531 


RASL^KJNQKGIJjNTRRJCSTDSVPISKSTLSRSLSLO-ASDFDGAS

ssgnpeavalapdaystgsssasstlkrtkkprppslkkkqttk 
kptetppvketqqepdeeslvpsgenlasetktesaxtegpspa 
lleetplepaagpkaacpldsesvegwppasgggrvqnsppvg 
rktlplttapeagevtpsdsggqedspakghsvrlefdysedks 
swdnqqenppptkkigkkpvakmplrrpkmkktpekldntpasp 
prspaepndi p iakgt ytfdi dkwddpnfnp fsstskmqespkl 








PQQSyNFDPDTCDESVDPFKTSSkTPSSPSKSPASFEIPASAMB 
ANGVDGDGLNKPAKKKKTPLKTDTPRVKKSPKRSPLSDPPSQDP 
TPAATP ETPPV I S AWHATDEE KLAVTNQKWTCMT VDLEADKQD 
YPQPS DLSTFVNETKFSS PTEELD YRNS YE IEYMEKIGS SLPQD 
DDAPKKQALYLMFDTSQESPVKSSPVRMSESPTPCSGSSFEETE 
ALVNTAAKNQHPVPRGIiAPNQESHLQVPEKSSQKELEAMGLGTP 
SEAIEITAPEGSFASADALLSRIAHPVSLCGALDYLEPDLAEXN 
PPLFAQKliQREAAHPTDVSISKTALYSRIGTAEVEKPAGLLFQQ 
PDLDSALQIARAEIITKEREVSEWKDKYEESRREVMEMRKIVAB 
YEKTIAQM I EDEQREKS VS \HQTVQQLVLEKEQA\ LADLNS VEK 
\SLADLFRRYEKMKEVLEGFRKNEEVLKRCAQEYLSRVKKEEQR 
YQALKVHA\ BEKLDRANAE \ IAQVRGXAQQEQAAHQAS LAERS S 
CR V\ D ALERTLE QKNKE I EELTKICDEL IAKMGKS 


5398 


56 


5426 


SG B VCRMESNFNQEG VPR PS YVFSAD P I ARPSE I N FDG I KLDLS

HEFSI»VAPNTEANSFESKDYIjQVCLRIRPFTQSEKELESEGCVH 

ILDSQTWLKEPQCILGRI^EKSSG\QM\AQKFSFPPGFLGPAT 

TQKEFFQGCIMHP\VKDLLKGQSRLIFTYGLTNSGKTYTFQGTB 

ENIRILPRTLNVLFDSLQERLYTKMNLKPHRSRBYLRLSSEQEK 

EEIASKSALLRQIKEVTVHNDSDDTLYGSLTNSLNISEFEESIK 

DYEQANLNMANSIXFSVWVSFFEIYNEYIYDLFVPVSSKFQKRK 

MLRLS QDVKG YS F I KDLQ W IQVS DS KEAYRLLKLG I KHQS VAFT 

KLNNASSRSHSIFTVKILQIEDSEMSRVIRVSELSLCDLAGSER 

TMKTQNEGERLRETGNINTSLLTLGKCINVLKNSEKSKFQQHVP 

FRESKLTHYF/ QSFFNGKGKICMIVNI SQCYLAYDETLNVLKFS 

AI AQ KVCVPDTLNS SQEKLFG P VK3 SQDVSLDSNSNS K I LNVKR 

ATI SWENS LEDLMEDEDLVEELEKAEETED/VGETKLLDEDLDK 

TLEENKAFrSHEEKRKIiliDLIEDLKKKLINEKKEKLTLEFKIRE 

EVTQEFTQYWAQREADFKETLLQEREI LEENAERRLAI FKDLVG 

KCDTREE AAKDI CATKVE TE E ATACLELKFNQ I KAE LAKTKGEL 

I KTKEELKKRENESDSLIQELETSNKKI ITONQRI KEL INI IDQ 

KEDTINEFQNLKSHMENTFKCNDKADTSSLIINNKLICNETVEV 

PKDSKSKICS3RKRVNENBLQQDEPPAKKGSIHVSSAITEDQKK 

S EEVR PNI AE I ED IR VLQENNEG LRAFLLTI ENELKNEKEEKAE 

LNKQIVHFQQ3LSLSEKKNLTLSKEVQQIQSNYDIAIAELHVQK 

SKNQEQEEKIMKLSNEIETATRSITNNVSQIKLMHTKIDEtiRTL 

DSVSQISNIDLLNLRDLSNGSBEDNLPNTQIiDLLGNDYLVSKQV 

KEYRIQEPNRENSFHSSIEAIWEECKEIVKASSKKSHQIEELEQ 

QIEKLQAEVKGYKDENNRLKEKEHKNQDDLLKEKETLIQQLKEE 

LQEKNVTLDVOIQHWEGKRALSELTQGVTCYKAKIKELETILE 

TQKVERS H S AKLEQD I LE KES 1 1 LKLBRNLKE FQEHLQDS VKNT 

KDLNVKELKLKEEITQLTNNLQDMKHLLQLKEEEEETNRQETEK 

LKEELSASSARTQN\I,NADLQR1CEEDYADLKEKLTDAKKQIKQV 

QKEVSVMRDEDKLLRI KINELEKKKNQCSQELDMKQR\TIQQLK 

EQL INQKVEEAI QQYERACKDLNVKEKI I EDMRMTLEEQEQTQV 

EQDQVL\EAKLSEVERLATELDRWRVXCNDLETKNN<2RSNKEHE 

NNTDVI^KLTNLQDELQESEQKYNADRKKWLEEIWL ITQAKEA 

ENIRNKEMKKYAEDRERFFKQQNEMEILTAQLTEKDSDLQKWRE 

ERDQLVAALE I QtjKAL I SSNVQKDNE I EQL KRI ISETSKIETQI 

MDIKPKRISSADPDKLQTEPLSTSFEISRNKIEDGSWLDSCEV 

STENDQSTRFPKPELB I QFTPLQPNKMAVKHPGCTT P VTVKIPK 

ARKRKSNEMEEDLVKCENKKNATPRTNLKFP1SDDRNSSVKKEQ 

KVAIRPSSKKTYSLRSQASIIGVNLATKKKEGTLQKFGDFIiQHS 

PS I LQS KAKKI I ETMSS S KLSNVEASKENVSQPKRAKRKLYTSE 

ISSPIDISGQVILMDQKMKESDHQI IKRRLRTKTAK 


5399 


705 


230 


GPRMAKFLSQDQINEYKECFSLYDKQQRGKIKATDLMVAMRCLG 
AS P T PGEVQRHLQTHG IDGNGELD FST FLTIMHMQ I KQEDP KKE 
ILLAMLMVDKEXKGYVMASDLRS K I/TS LGE KLTHKEV \ DDL FRE 
\ADI EPNGKVKYDEFIHKI TS YLDGTY 


5400 


931 



248 


SHCSSGMEIPPTNYPASRAALVAQNYINYQQGTPHRVFEVQKVK 
QASMEDIPGRGHKYRLKFAVEEIIQKQVKVNCTASVLYPSTGQE 
TAPEVNFTFBGETGKNPDEEDNTFYQRLKSMKEPLEAQNI\PDN 








FGNVSPEMTLVLHIoAWVACGYI I WQNSTEDTWYKMVKlQTVKQV 
QRNDDFI ELDVTILLHNIASQEI IPWQMQVLMHPQYGTKVKHNS 
RLPKEVQIiE 


5401 


3 


1360 


TGWSYGPTTSLAFLAPRDF PF P PKLLI H PQAWRLSCGAGSMGS 
QAAAEWRNWASWEGSSSLSGCSMGCFKDDRIVFWTWMFSTYFME 
KWAPRQDDMLFYVRRKLAY SGS E SGADGRKAAEP E VE VE VYRRD 
SKKLPGLGDPDIDWEESVCIjNLILQKLDYMVTCAVCTRADGGDI 
HIHKKKSQQVFASPSKHPMDSKGEESKISYPNIFFMIDSF\EE\ 
VFSDMTVGKGEMVCVELVASDKTNTPQGVI FOGS I RYEALKKVY 
DNRVSVAARMAQK\MSFGFSKYSNMEF\VR\MKGPQGKGHAEMA 
VSRVSTGDTS PCGTEEDS SPAS PMHERVTS FSTPPTPERNNRPA 
FFS PSLKR KVPRNR 1AE MKKS H SANDS EE F FR3DDGGADLHKAT 
NLRSRS LSGTGRSLVGSWLKLNRADGNFIiLYAHLTYVTLPLHR I 
LTDILEVRQKPIIiMT 


5402 


3445 


1563 


GECFIMAA WQQNDLVFEFASNVMEDERQLGDPAI FPAVI VEHV 
PGADILNS YAGLAC VEEPNDM I TESSU5VAEEE I IDDDDDDI TL 
TVEASCHDGDET1ETI EAAEALLNMDSPGPMbDEKRINNNIFSS 
PEDDMWAP VTHVS VTLDGI PEVMETQQVQEKYADS PGASS PEQ 
PKRKKGRKTKPPRPDS PATTPNI S VKKKNKDGKGNT I YLWE FLL 
ALLQDKATCPKYXKWTQREKGIFKIjVDSKPVSRLWRKHKNKP\D 
MNYEPMGRALRYYYQRGILAKVEGQRLVYQFKEMPKDL1YINDE 
DPSSSIESSDPSLSSSATSNRNQTSRSRVSSSPGVKGGATTVLK 
PGNS KAAKP KDP VEVAQPS EVLRTVQPTQSPYPTQLFRTVHWQ 
PVOAVPEGEAARTSTMQDETLNSS VQS IR \TI QAPTQVPVWS P 
RNQQ\LHTVTLQrVPLTTVIASTDPSAQTGSQKFXLQAIPSSQP 
MTVLKENVMLQSQKAGSPPSIVLGPARV\QQVLTSNVQTICNGT 
VS V\AS S PS FS \ ATAP WTLFLLGSSQLVAHPPGTVITS VI ICIQ 
ETK7LTQE VEKKE SEDHLKENT EKTEQQPQP YVMWS S SNG FTS 
QVAMKQNELLEPNSF 


5403 


3445 


1563 


GEC FI MAAWQQNDLVFEFASNVMEDESQLGDPAI FPAVI VEHV
PGADI LNS YAGLACVE E PN DM I TES SLD VAEEE 1 1 DDDDDD I TL
T VEAS CHDGDET I ETI E AAEALLNMDS PGPMLDEKRINNNI FSS 
PEDDMWAPVTHVSVTLDGIPEVMETQQVQEKYADSPGASSPEQ 
PKRKKGRKTKPPRPDS PATTPNI SVKKKNKDGKGNTIYLWEFIiL 
ALLQDKATCP KY I KWTQRE KG I FKLVDS KP VS RLWRKH KNKP \D 
MNYE PMGRALR Y YYQRG I t*AK VEGQK L V YQ FKEM PXDL I YI NDE 
DPSSS I ESSDPSLSSSATSNRNQTSRSRVSSSPGVKGGATT VI*K 
PGNSKAAIO?ICDPVEVAQPSEVLRTVQPTQSPyPTQLFRTVHVVQ 
PVQAVP EGEAART S TMQDETLNSSVQS I R\ TIQAPTQVPVWS P 
RNQQ\ LHTVTLQT V PLTTVI AS TOPS A0TGSQKF I LQAI P S SQ P 
MTVLKENVMLQSQKAGSPPSIVI/5PARV\QQVJjTSNVQTICNGr 
VSV\ASSPSFS \ATAPWTLFLLGSSQLVAHPPGTVITSVI KTQ 
ETKTLTQEVEKKE S EDHLKENTEKTEQQ PQ P YVMWS S SNG FTS 
QVAM KQNELLE PN S F 


5404 


187 


1111 


LPVTL1 FAKMKTLQSTLIjLLLLVPLIKPAPPTQQDSRI I YDYGT 
DNFEES I FSQDYEDKYLDGKNI KEKETVI IPNEKSLQLQKDEAI 
TPLPPKKENDEMPTCLLCVCIjSGSVYCEEVDIDAVPPLPKESAY 
LYARFNKIKKLT\AKDFADIPNIjRRLDFTGNLIEDIEDGTFSKL 
SLVEBLSLAENQLLKLPVLPPKLTLFNAKYNKIKSRGIKANAFK 
KLNNLTFLYLDHNALESVPLNLPESLRVIHLQFNNIASITDDTF 
CKANDTSYIRDRIEEIRLEGNPIVLGKHPNSFICLKRLPIGSYF 


5405


2199 


1220 


QNSRSLHMDPQNQHGSGSSLWIQQPSLDSRPRLDYEREIQPTA 
1 JjisLJjyx KJ\1 KtjSNE x TEGF S W lUuPAPRTAPRQEKHERTHEI 1 
P INVNNN YEHRHTSHLGHAVLPSNARGP I LSRS TS TGS AASS GS 
NSSASSEQGLLGRSPPTRPVPGHRSERAIRTQPKQLIVDDLKGS 
LKEDLTGHKFICEQCGKCKCGECTAPRTLPSCLACNRQCLCSAE 

smveygtcmcl\vkgifyhcsnddegdsysdnpcscsqshccsr 
ylcmgamslflpcllcyppakgclklcrrcydwihrpgcrckns 
ntvycklescpsrgqgkps 


5406 


279 


2732 


rwrtynvegpltfmdvaiefcleewqcldtaqqnlyrnvmleny 








RNLVFLG/ 1 IAVSKPDLITCIiBQEKEPWEPMRRHEMVAKPPVMCf
SHPTQDFWPEQHIKDPFQKATLRRYKNCEHKNVHLKKDHKSVDE 
CKVHRGG YNGFNQCL PATQSK I FLFDXC VKAFHKFSNSNRHKI S 
HTEKKLFKCKECGKSFCMLSHLAQHXIIHTRVNFCKCEKCGFCAF 
NCPSIITKHKRINTGEKPYTCEBCGKVFNWSSRLTTHKKNYTRY 
KLYKCEECGKAFNKSS ILTTHKI 1 RTGEK F YXCXECAKAFNQS S 
NLTEHKKIHPGEKPYKCEECX3KAFNWPSTLTKHKRIHTGEXPYT 
CEECG KAFNQFS NLTTHKR I HTA\ EKF X KCTECGEAFSRS \ S NL 
TKHKEIHTEKKPYKCEBCX3KAFKWSSKLTEHKLTHTGEKPYKCE 
KCGKAFNCPS I ITKHNRINTGEKPYTCEECGKVFNWSSRLTTHK 
KNYTRY KLY KC E ECGKAFNKSS I LTTH KKIH I EKX FYKCEE CGK 
AFKWSSKLTBHKITHTGEKPYKCEECGKAFNHFSILTKHKRIHT 
GEKPYKCEECGKAFTQSSNiiTTHKKIHTGEKFYKCEECGKAFTQ 
SSN LTTH KK I HTGGKP YKCEECGKAFNQFSTLTXHKI IHTEE KP 
YKCEECGKAFKWSSTLTKHKIIHTGEKPYKCBECG\KAFKLSST 
LSTHKIIHTGEKPYKCEKCGKAFNRPSNLIEHKKIHTGBQPYKC 
EECGKAFNYS SHLNTHKR IHTKEQP YKCKECGKAFNQYSNLTTH 
NX I HTG EKL YKPED VTVI LTTPQTFSNI K 


5407 




659 


RFRRRQSSCCTG WLAGWLLRAAPR FCRRTETDMEQGKGLAVL 1 1* 
All LLQGTLAQS I KGNHLVKVYDYQEDGS VLLTCDAEAKN ITWF 
KDGKMIGFLTEDKKKWNLGSNAKDPRGMYQCKGSQNKSKPLQVY 
YRMCQNCIELNAATISGFLFAEIVSIFDliAVGVYFIAGTGMEFR 
QS\RASDKQTLLP\NDPAPTQPLKDPRKMTQYSHLQGN\QLRRN


5408




6128 


O^SKGTCHPQAQQPWDEGVWQEAPSQSEPWGQSQEPPTMPQRLP 

HAR QHTPLPLGSADYRR WS VR PQGPHRDPKDS RDAAKREQGS L 

APRPVPASRGGKTLCKGYRQAPPGPPAQFQRP I CSAS PPWASRF 

STPCPGGAVREDTYPVGTQGVPSLALAQGGPQGSWRFLEWKSMP 

RLPTDliDIGGPWFPHYDFERSCWVRAISQEDQLATCWQAEHCGE 

VRNKDMSWPEEMS FIANSS KIDRHKVPTEKGATGLS NLGNTCFM 

NSSlQCVSNTQPLTQYFISGRHLYELNRTbJPIGMKGHMAKCYGD 

LVQELWSGTQKNVAPLKLRWTIAKYAPRFNGFQQQDSQELLAFL 

LDGLHEDLNRVHEKPYVELKDSDGRPDWEVAAEAWDNHLRRNRS 

IWDLFHGQLRSQVKCKTCGHISVRFDPFNFLSLPLPMDSYMHL 

EITVI KLDGTTPVRYGLRLNMDEKYTGLKKQIiSDLCGLNSEOI L 

LAEVHGSN1 KNFPQDNQKVRLS VSGFLCAFE I P VPVS PISAS S P 

TQTDFSSS PSTNEMFTLTTNGDLPRPI FI PNGMPICTVPCGTEX 

NFTNGMVNGHMPSLPDSPFTGYIIAVHRKMMRTELYFIjSSQKNR 

PSLFGMPLIVPCTVHTRKKDLYDAVWIQVSRLASPLPPQEASNH 

AQDCDDSMGYQYPFTIiRWQKDGNSCAWCPWYRFCRGCKIDCGE 

DRAF I GNAYI AVDWHPTALHIiR YQTSQER WDEHES VEQSRRAQ 

VEPINLDSCLRAFTSEEELGENEMYYCSKCKTHCLATKKLDLWR 

LP PIL 1 I HLKR FQF VNGRW I K$Q XI VKF PR ES FDPS AFLVPR DP 

ALCQHKPLTPQGDELSEPRILAREVKKVDAQSSAGEEDVLLSKS 

PSSLSANIISSPKGSPSSSRKSGTSCPSSKNSSPNSSPRTLGRS 

KGRLRLPQIGSKNKLSSSKENLDASKENGAGQICELADALSRGH 

VLGGSQPELVTPQDHEVALANGFLYEHBACGNGCGNGYSNGQLG 

NHSEEDSTDDQREDTRIKPIYNLYAISCHSGILGGGHYVTYAKN 

PNCKW YC YNDS S C KE LH PDE I DTDSAY I LFYEQQG I D YAQFL P K 

TDG KKMADTSS MDED FE SDY\ EKYCVLQ 


5409 


2745 


6128 


QGSKGTCHPOAgQPWDEGVWQEXP^0s^PWGQSQEPPTi^PQRLP
HARQHTPLPLGSADYRRWSVRPQGPHRDPXDSRDAAKREQGSL 
APRP VP AS RGG KTLC KG YRQ AP PG P PAQ FQRP I CSAS P P WAS R F 
STPCPGGAVREDTYPVGTQGVPSLALAQGGPQGSWRFLEWKSMP 
i\ur lULiuivjufwr rttlUr EKbeWKAlSyEDQIATXrWQAEHCGE 
VRIJKDMSWPEEMSFIANSSKIDRHKVPTBKGATGLSNLGNTCFM 
NS S I QCVSNTQPLTQY F I SGRHLY ELNfRTN P 1 GMKGHMAKCYGD 
LVQE LWSGTQKNVAPL KLRWT I AKYAPRFNGFQQQDS QELLA PL 
LDGLHEDLNRVHEKPYVELKDSDGRPDWEVAAEAWDNHLRRMRS 
I WDLFHGQ LRSQVKCKTCGH I S VRFDP FNFL SL PL PMDS YMH L 
EITVIKLDGTTPVRYGLRLNMDEKYTGLKKQLSDLCGLNSEQIL 
LAEVHGSNIKNFPQDNQKVRLSVSGFLCAFEIPVPVSPISASSP 








TQTD FS S S PS TNEM PTLTTNGDLPRP I F I PNGM PNTWP CGTE K
NFTNGMVNGHMPSLPDS PFTGYI I AVHRKMMRTELYFLSSQKNR 
PSLFGMPLIVPCTVHTRKKDLYDAWIQVSRLASPLPPQEASNH 
AQDCDDSMGYQYPFTLRWQKDGNSCAWCPWYRFCRGCKIDCGE 
DRAFIGNAYIAVDWHPTALHLRYQTSQERVVbEHESVEQSRRAQ 
VE P I N LDS CLRAFTS EE ELGENEM YYCS KCKTHCIATKKLDLWR 
LPPILI IHLKRFQFVNGRW I KSQKI VKFPRES FDPSAFLVPRDP 
ALCQHKPLTPQGDELSEPRI LAREVKKVDAQSSAGEEDVLLSKS 
PSSLS AN 1 1 S S PKGS PS S SRKSGTS C P S SKNS S PNS SPRTLGRS 
KGRLRLPQIGSKNKLSSS KENLDAS KENGAGQ I CELADALSRGH 
VIjGGSQ P ELVTPQDHE VALANGFL YEHEACGNGCGNG YSNGQLG 
NHSEEDSTDDQREDTRI KP I YNLYAI SCHSGILGGGH Y VTYAKN 
PNCKWYC YNDSSCKELH PDE IDTDSAYI LFYBQQG I DYAQFLPK 
TDGKKMADTS S MDEDFES D Y \ EKYC VLQ 


5410 


2 


710 


LRFPGQARHVWLAARMQAPHKEHL YKLLiVIGDLGVGKTS 1 1 KRY 
VHQNFSSHYRAT I GVDFALKVLHWDPETWRLQLWDI AGOERFG 
NMTRVYYREAMGAF I V FD VTR P AT FE AVAKWKNDLDSKLS L PNG 
KPVSVVLLANKCDQGKDVLMNNGLKMDQFCKEHGFVGW FETSAK 
ENINIDEASRCLVKHI LANECDLMES lEPDVVKPHIiTSTKVASC 
SG\CAKI LVGTFAGVW 


5411 


1302 


239 


TGPAAAGRRKALGSFGKPSPVTGLRAARRRRTRPSAPAAPSVGC 
GKRRESDAGAGGERASVRTGSGRRGGRTMAGDSEQTLQNHQQPN 
GGEF PLIGVSGGTASGKS SVCAKI VQLLGQNEVDYRQKQ WILS 
QDSFYRVLTSEQKAKALKGQFNFDHPDAFDNELILKTLKEITEG 
KTVQI PVy DFVSHSRKEETVTVYPAD WLFEGI LA 7 YSQER / IR 
DLFQM KL FVDTDADTRLSRR VLKD I S ERGRDLEQI LSSSTLR FV 
KPA\ FEE FCLPPK\KYADVI I PR\GADN\R VPINL I VQH I Q \D I 
LNGGP S \NRQTNGCLNGYTPSRKRQASES SSRPH 


5412 


3180 


313 


QGISNFFHKEANFWFEVSGYLISPLRSPFVDPALEWSLMASPWN 

KMEGESSRFEIHTPVSDKKKKKCSIHKERPQKHSHEIFRDSSLV 

NEQSQITRRKKRKKDFQHIilSSPliKKSRICDETANATSTLKKRK 

KRRYSALE VDEEAGVTWLVDKENINNT P KHFRKDVDWCVDMS 

IEQKLPRK\PKTDKFQVLAKSH\AHKSEALHSKVREKKNKKHQR 

KAASWESQRA\RDTLPQSEFPTQEESWLSVGPGGEITELP\ASA 

HKNKSKKKKKKSSNREYET\LAMPEGSQAGREAGTDMQESQPTV 

GLDDETPQLLGPTHKKKSKKKKKKKSNHQEFESIAMPEGSQVGS 

EVGADMQES\RPAVGLHGETAG1PAPAYKNKSKKKKKKSNHQEF 

EAVAMPESLES AYPEGSQ VGS E VGTVEGS TALKG FKESNSTKKK 

SKKRKLTSVKRARVSGDDFSVPSKNSBSTLFDSVEGBGAMMEEG 

VKS RPRQ KKTQ ACLAS KHVQE APRLE P ANEEHNVETAEDS B I RY 

LSADSGDADDSDADLGS AVKQLQEFIPNI KDRATST I KRMYRDD 

LERFKEFKAQGVAIKFGKFSVKENKQLEKNVEDFLALTGXESAD 

KLLYTDRYPEEKSVITNLKRRYSFRLHIG\RNIARPWiCLlYYRA 

KKMFDVNNYKGRYSEGDTEKLKMYHSLLGNDWKTIGEMVARRSL 

SVALKFSQISSQRNRGAWSKSETRKLIKAVEEVILKKMSPQBLK 

E VDSKLQ ENP ESCLS I VREKL YKG I SW VE VEAKVQTRNWMQCKS 

KWTEILTKRMTNGRR I Y YGMNALRAKVS L I ERLYE INVEDTNE I 

DWEDLAS At GDVp PS YVQTKFS RLKAV YVP FWQKKT F P E I ID YL 

YETTLPLLKEKLEKMMEKKGTKIQTPAAPKQVFPFRDIFYYEDD 

S EGGGHRKRKRRPRRHAWFTP V I PVLWEAKAGWI I 


5413 




1304 


RFPAGVAPRRAMANVSKKVSWSGRDRDDEEAAPLLRRTARPGGG 
TPIiLNfGAGPGAARQSPRSALFRVGHMSSVKLDDELLEP\DMDPP 
HPFPKEIPHNE KLL SLKY ESLD YDNSENQLFLE EERR I NHTAFR 
TVEIKRWVICALIGILTGLVACFIDIWENLAGLKYRVIKGNID 
KFTEKGGLSFSLLLWATLNAAFVLVGSVIVAFIEPVAAGSGIPQ 
IKCFLNGVKI PHWRLKTLVI KVSGVILSWGGLAVGKEGPMIH 
SGSVIAAG ISQGRSTSLKRDFKI FE YLRRDTBKRDFVS AGAAAG 
VSAAFGAPVGGVLFS LEEGAS FWNQFLTWR I FFASMI STFTLNF 
VLS I YHGNMWDLSS PGLINFGRFDSEKMAYTIHEIPVFI AMGW 
GGVLGAVFNALNYWLTMFRIR YIHRPCLQVI EAVLVAAVTATVA 
FVLIYSSRDCQPLQGGSMSYPLQLFCADGEYNSMAAAFFNTPEK 








ovvaufftiurt^a xwjf iti lj^Lifc'T.LVY FFLACWTYGLTVSAGVFI P 
SLL:GAAWGRL?GISLSYLTGAAIWADPGKYALMGAAAQIiGGIV 
RMTkfl LTVIMMEATSNVTYQFP IMLVLMTAKI VGDVFIEGLYDM 
HIQI^SVPPLHWEAPVTSHSLTAREVMSTPVTCLRRRBKVGVIV 
DVLSDTASNHNG FP VVEHADDTQP ARLQGL I LRS QL I VLLKHKV 
FVERSNIX3LVQRRLRLKDPRDAYPRFPP IQS IHVSQDERECTKD 
LSBFMMPSPYTVPQEASLPRVPKLFRALGLRHLWVDNRNQWG 
LVTRKDIARYRLGKRGLEELSLAQT


5414


2130 


390 


GYASAWDRALPS PLLS PTSRVFRTS P PRCVSTETGRRDRARVPS 
QWCSVLQGKLPVSGRTSLACVRSILLSPASSPRKVGIVGGTGAR 
AGAAPRDHGRVRHRRPSSARRMTRTTGQCLAPRGCQGPRGTRSP 
RSP RS RTRRG CS ASPACLP/ CRS AL I VAVLC Y INI iLNYMDRPTV 
AGVLPDIEQFFNIGDSSSGLIQTVFISSYMVLAPVFGYLGDRYN 
RKYLMCGG I AFW S LVTLGS S F I PGEH PWLliLLTRGLVGVGEAS Y 
ST IAPTL1ADLFVADQRS RMLS I F YFAI P VGSGLG YIAGSKVKD 
MAGDWHWALRVTPGLGWAVLHiFLWREPPRGAVERHSDLPPL 
NPTSWWADLRALARNPSFVIiSSLGFTAVAFVTGSLALWAPAFLL 
RS R WLGETP PCLPGDS CSS SDSLI FGLITC LTG VLG VGLGVE I 
SRRLRHSNPRADPLVCATGIiLGSAP FLFLSLACARGS IVATYI F 
I FIGETLLSMNWA I VAD I LL Y WI PTR R STAEAFQ I VJj SHLLGD 

AGSPYLIGLISDRLRRNWPPSFLSEFRA1^FSLMI.CAFVGALGG 
AAFLGTAHLH 


5415 


693 


2986 


IPPKTKLELQKH\LTTLT\NQEQATIFEEVQKLRPRNEQRENEIj 

iisplrclfeekqkehihigemkqtsqmaaeiflgselppsarrf 
rldmlknkakrslteslesilsrgnkarglqrhsisvdldssls 

STLSNTS KEPS VCE KEALP I S E SS P KLLGS SEDLS SDSESHLPE 
EPAPLSPQQAFRRRANTLSHFPIECQEPPQPARGSPGVSQRKLM 
RYHSVS TET PH EK K DFES KANHLGDSGGTPVKTRRHSWRQQI FL 
RVATPQ KACDS S SR YED YSE LGELP PRS PLEP VCEDGPFG PPPE 
EKKRTSRELRELWQKAILQQILLLRMEKENQKLQASENDLLNKR 
LKLDYE E I TPCLKE VTTVWEKMLSTPGRS KIKFDME KMHS AVGQ 

gvpVrhhrgeiwkflaeqfhlkhqppskqqpkdvpykellkqlt 

SQQHAIL IDLGRTFPTHP YFS AQLGAGQLSLYN I LKAYSLLDQE 
VGYCQGLSFVAGILLLHMSEEEAFKMLKFI^IFDMGLRKQYRPDM 
IILQIQMYQLSRLLHDYHRDLYNHLEEHEIGPSLYAAPWFLTMF 
ASQFPLG FVARVFDM I FLQGTE VT FKVALSLtiGSHKPLILQHEN 
I/ETIVDFIKSTLPNLGLVQMEKTINQVFEMD1AK0LQAYEVEYH 
VLQEEI* I DS S PLS DNQRMDKLE KTNS SLRKQNLDLLEQLQVANG 
RIQSLEATlEKLLSSESKLKQAMLiTLELERSALLQTVEELRRRS 
AKPSDREPECTQPEPTGD 


5416 


27 


4074 


KSQLFCFWGGKAGDI LSGDQDKEQKDPYFVETP YGYQLDLDFLK

YVDDI QKGNT IKRLN IQKRRKPS VPC PE PRTTSGQQG I WTS TES 

LSSSNSDDNKQCPNFLIARSQVTSTPISKPPPPLETSLPFLTIP 

ENRQLPP PS PQLPKKNLHVTKTLMETRRRLEQERATMQMTPGE F 

RR PRLAS FGGMGTTS S LPS F VGSGNHNPAKHQLQNG YQGNGD YG 

SYAPAAPTTSSMGSSlRHSPLSSGlSTPVTNVSPMHIiQHIRBQM 

AIALKRLKELEEQVRTIPVLQVKISVLQEEKRQLVSQLKNQRAA 

SQINVCGVRKRSYSAGNASQLEQLSRARRSGGELYIDYEEEEME 

TVEQSTQRIKEFRQL\TADMQALEQKIQDSSCEASSELRENGEC 

RS VAVGAE ENMND I WYHRGS RS CKDAAVGTLVEMRNCGVS VTE 

AMLGWTEADKEIELQOQTIESLKEKIYRLEVQLRETTHDREMT 

KLKQ E LQAAG S RK KVDKATMAQ P L VFS KWEA WQTRD Q MVG S H 

MDL VDTC VGTS VETNS VGISCQ PECKNKWG PELPMN W W I VKER 

VEMHDRCAGRSVEMCDKSVSVBVSVCETGSNTEESVNDLTLLKT 

NliNLKEVRS IGCGDCSVDVTVCS PKECAS RGVNTEAVSQVE AA V 

MAVPRTADQDTSTDLEQVHQFTNTETATLIESCTNTCLSTLDKQ 

TSTQTVETRTVAVGEGRVKDIMSSTKTRSIGVGTLLSGHSGFDR 

PSAVXTKESGVGQININDNYLVGLKMRTIACGPPQLTVGLTASR 

RSVGVGDDPVGESLENPQPQAPLGMMTGLDHYIERIQKLLAEQQ 

TLIAENYSELAEAFGEPHSWGSLNSQLISTLSSINSVMKSAST 

EEIiRNPDFQKTSLGKITGSYLGYTCKCGGLQSGSPLSSQTSQPB 








QEVGTSEG KP ISS LDA FPTQEGTLS P VNl^TDDQ IAAGL YACTNN 
E S TLKS I M KKKDGNKDSNG AKKNLQF VG I NGG YETTS SDDSS SD 
ESSSSESDDECDVIEYPLEEEEEEEDBDTRGMAEGHHAVNIEGL 
KSARVEDEMQVQECEPEKVBIRERYELSEKMLSACNLLKNTIND 
PKALTSKDMR FCLNTLQHEWFR VS SQKSA IPAMVGDYIAAFEA I 
SPDVLRYVINLADGNGNTALHYSVSHSNFEIViCJbLI»DADVCNVD 
HQNXAGYTP I M1AALAAVEAEKDMRI VEELFGCGDVNAKASQAG 
QTALMLAVSHGR I DMVKGLLACGAD VN I QDDEGSTALMCASEHG 
HVEIVKLLLAQPGCNGHIiEDNDGSTALS I ALE AGHKD I AVLL YA 
HVNFAKAQSPGTPRLGRKTS PG PTHRGS FD 


5417 


27 


4074 


KSQLFCFWGGXAGDILSGDQDKEQKDPYFVETPYGYQLDLDFLK 
YVDDIQKGNTI KRLNIQKRRKPS VPCPEPRTTSGQQGI WTSTES 
LS SSNSDDNKQCPNFLIARSQVTSTPI SKPP P PLE TSLPFLT I P 
ENRQLPPPSPQLPKHNLHVTKTLMETRRRLEQERATMQMTPGEF 
RRPRLAS FGGMGTTS S LPSFVGSGNHNPAKHQLQNGYQGNGDYG 
S YAP AAP TTS S MGS S I RHS PLS S G I STPVTNVS PMH LQH I REQM 
AIALKRLKELEEQVRTIPVLQVKISVLQEEKRQLVSQLKNQRAA 
SQINVCGVRKR5YSAGNASQLEQLSRARRSGGELYIDYEEEEME 
TV EQSTQR I KE FRQL \ TADMQALEQKIQDS S CE ASS ELRENGE C 
RSVAVGAEENMNDIWYHRGSRSCKDAAVGTLVEMRNCGVSVTE 
AHLGVMTEADKEI E LQQQT I E S LKE KI YRLE VQLRETTHDREMT 
KLKQELQAAGS RKKVD KATMAQ PL VFSKVVEAVVQTRDQMVGSH 
MDLVDTC VGTS VETNS VG I SCQ PECKNKWG PELPMNW W I VKER 
VEMHDRCAGRS VEMCD KS VS VE VS VCETG SNTEES VNDLTLLKT 
NLNLKEVRS IGCGDCSVDVTVCS PKE CASRGVNTEAVSQVEAAV 
MAVPRTADQDTS TDL EQ VHQFTNTETATL I ES CTNTCLSTLDKQ 
TSTQTVETRTVAVGEGRVKDINS STKTRS 1GVGTLLSGHSGFDR 
PS AVKTKE SG VGQ INI NDN YLVGL KMRT I ACG PPQT iTVG LTAS R 
RSVGVGDDPVGESLENPQPQAPLGMMTGLDHYIERIQKLLA5QQ 
TLIAE^SELAEAFXSEPHSQMGSI^SQLISTLSSINSVMKSAST 
EELRNPDFQKTSLGKI TGS YLGYTCKCX5GLQSGS PLSSQTSQPE 
QEVGTSEGKP ISSLDAFPTQEGTLS PVNLTDDQI AAGLYACTNN 
ESTLKSIMKKKDGNKDSNGAKKNLQFVGINGGYETTSSDDSSSD 
ESSSSESDDECDVIEYPLEEEEEEEDEDTRGMAEGHHAVNIEGL 
KSARVEDEMQVQECEPEKVSIRERYELSEKMLSACNLLKNTIND 
PKALTSKDMRFCLNTLQHEWFRVS SQKSA IPAMVGDYIAAFEA I 
SPDVLRYVINLADGNGNTALHYSVSHSNFEIVKLLLDADVCNVD 
HQNKAG YTP I MLAALAAVE AEK DMR I VEELFG CGD VNAKASQAG 
QTALMLAVSHGRIDMVXGLLACGADVNIQDDEGSTALMCASEHG 
HVEIVKLIJ^QPGa^GpLEDNEK3STALSIALEAGHKDIAVLLYA 
HVNFAXAQSPGTPRLGRKTS PGPTHRGSFD 


5418 


24 


1133 


SVPRAGGDMBTGAAELYDQALLG I LQHVGNVQDFLRVLFGFLYR 
ICTDFYRLLRHPSDRiyKSFPPGAAQALVLQVFKTFDHMARQDDEKR 
RQELEEKIRRJOEEEEAKTVSAAAAEKEPVPVPVQEIEIDSTTEL 
DGHQEVEKVQ P PG P VKEMAHGS QEAEAPGAVAGAAE VPR\ E P ? I 
LPRIQEQFQKNPDSYNGAVRENYTWSQDYTDLEVRVPVPKHWK 
GKQ VS VALS SS S I R VAMLEENG ERVLMEGKLTHKINTES S LWS L 
EPGKCVLVNLS KVGE YWWNAI LEGEEP I DIDKINKERSMATVDE 
EEQAVLDRLTFDYHQKLQGKPQSHELKVHEMLKKGWDAEGSPFR 
GQRFDPAMFNI S PGAVQF 


5419 


1395


259 


GTHPLDPDLVSRTSVQGPLMTMACPGMSDTEESPFLGPRAAEEG

SESEACEAFGRRKSEEEGRRSDTSGFGRSRKHKVNWKHPERADA 

KDPASLPQC/LGP/DCVRPAQPSSKYCSDDCGMKLAANRIYEIL 

PORIOOWOOS PP T ARPHrtWT.T.P'OT'PDT7nr4Cai>n»DT nUMDnnov 

ELEAIILRAKQQAVREDEESNEGDSDDTDLQIFCVSCGHPINPR 
VALRHMERCYAK YESQTS FGSMYPTR IEGATRLFCDVYNPQS KT 
YCKRLQVLCPEHSRDPKVPADEVCGCPLVRDVFELTGDFCRLPK 
RQCNRHYCWEKLRRAEVDLER\mVWYKLDELFEQERNVRTAMTN 
RAGLLALMLHQTIQHDPLTTDLRSSADR 


5420


111


1733 


NEAGGACPFKGGASGRLYLSPRLPRVSVAGCEERPLGWVWVLGG 
GGPLPARPPRAQRHLGFSHAEQSMEAPDYEVLSVREQLFHERIR 








ECIISTLLPATLYILCHI^RFKKPAEPTfVGMMKMPPSTRL/ 
LLELCTFTLAI ALGAVLLLPFS 1 1 SN3VLLSLPRNYYIQWLNGS 
LIHGLWNLVFLFSNLSLI FLMP FAYFFTES EG FAGSRKG VLG RV 
YETWMLMLLTLLVLGMVWVASAI VDXNKANR ES IiYDFWE Y YLP 
YL YS C I S FLG VLLIJj VCTPLGLARMFS VTGKLLVKPRLLE DLEE 
QLYCS AFEEAALTRR I CNPTSCWL PLDMELLHRQVLALQTQRVL 
IiEKRRKASAWQRNLGYPIAMLCLLVLTGLSVLIVAIHILELLID 
EAAM PRGMQGTSLGQVS FS XLGSFGAVIQWL I F YLMVS S WGF 
YS S PL FRS LR PRWHDTAMTQI IGNC VCLL VLS SALPVFSRT LGL 
TR FOLLGDFGR FN WLGNF Y I VFL YNAAFAGLTTLCLVKT FTAAV 
RAELIRAFGERE 


5421 


117


1733 


NEAGGACPFKGGASGRLYLSPRLPRVSVAGCEERPLGWVWVLGG 
GGFLPARPPRAQRHLGFSHAEQSMEAPDYEVLSVREQLFHERIR 
EC 1 1 STLLFATLY I LCHI FLTRFKKPAEFTT\GMMKMPPS TRL/ 
LLELCTFTLA I ALGAVLLLPFS 1 1 SNEVLLSLPRNYYIQWLNGS 
LIHGLWNLVFL F SNLS L I FLM PFA YFFTES BG FAGSRKG VLGRV 
YETWMLMLLTLLVLGMVWVASAIVDKNKANRESLYDFWEYYLP 
YLYSCISFLGVLLLLVCTPLGLARMFSVTGKLLVKPRLLEDLEE 
QLYCSAFEEAALTRRICNPTSCWLPLDMELLHRQVTiALQTQRVL 
LEKRRKASAWQRNLGYPLAMLCLLVLTGLSVLIVAIHILELLID 
EAAMPRGMQGTSLGQVS FS KLGSFGAVIQWLI FYLMVSSWGF 
YSSPLFRSLRPRWHDTAMTQIIGNCVCLLVLSSALPVFSRTLGL 
TRFDLLGDPGRFNWLGNFYIVFLYNAAFAGLTTLCLVKTFTAAV 
RAELIRAFGERE 


5422 


3 


1263 


SCGESIiPTWLAGASRPGIGRKGGAWGGRGGSSPAQVIiLSPGPVF 
KAGCNWWHLSRDQAG VQRCDLGSSQP PPLGFKRFS CLSLPSS WD 
YRSTVLCVSKMEADLSGFNIDAPRWDQRTFLGRVKHFLNITDPR 
TVF VSERELDWAKVMVEKS RMGWP PGTQVEQI.T .YAKKLYDSAF 
HPC3TGEKMNVIGRMSFQLPGGMIITGFMLQFYRTMPAVIFWQWV 
NQSFNALVNYTNRNAASPTS VRQMALS Y FTATTTAVATAVGMNM 
LTKKAP PLVGRWVPFAAVAAAKCVNI PMMRQQELI KGI CVKDRN 
ENEIGHSRRAAAIGITQWISRITMSAPGMILLPVIMERLEKLH 
FMQKVKVL/SAPLQVMLSGCFLIFMVPVACGLFPQKCELPVSYL 
EPKLQDTI KAKYGELEPYVYFNKGL 


5423 


3186 


905 


GVSMALGEEKAEAEASEDTKAQSYGRGSCRERELDIPGPMSGEQ

PPRLEAEGGLISPVWGAEGIPAPTCWIGTDPGGPSRAHQPOASD 

ANREPVAERSEPALSGLPPATMGSGDLLLSGESQVEKTKLSSSE 

E FPQTLS LPRTT I CSGHDADTEDDPSLADLPQALDLS QQPHSS G 

LSCLSQWKSVLSPGSAAQPSSCSISASSTGSSLQGHQERAEPRG 

GSLAKVSSSLEPWPQEPSSWGIiGPRPQWSPQPVFSGGDASGL 

GRRRLSFQAEYWACVLPDSLPPSPDRHSPLWNPNKEYEDLLDYT 

YPLRPGPQLPKKLDSRVPADPVLQDSGVDLDSFSVSPASTLKSP 

TNVS PNCP PAEATALP FSG PRE PSLKQWPS R VPQ KQGGMGLAS W 

SQIASTPRAPGSRDARWERREPALRGAKDRLriGKHLDMGSPQL 

RTRDRGWPSPRPEREKRTSQSARRPTCTESRWKSEEEVESDDEY 

LALPARLTQVS SLVS YLGS I STLVTLP TGD I KGQS PLE VSDS DG 

PASFPSS SSQSQLPPGAALQGSGDPEGQNPCFLRS FVRAHDSAG 

EGSLGSSQALGVSSGLLKTRPSLPARLDRWPFSDPDVEGQLPRK 

GGEQGKESLVQC \ VKTFG\ CQL EELICWLYNV\AD VTDHGTPAR 

SNLTS LK\ S S LQL Y RQ FKKD IDEHQSLTES VLQKGE ILLQCLLE 

NTPVLEDVLGRIAKQSGELESHADRLYDSILASLDMLAGCTLI P 

DKKPMAAMEHPCEGV 


" 5424 


3186 


905 


GVSMALGEEKAEAEASEDTKAQSYGRGSCRERELDIPGPMSGEQ
P PRLEAEGGL I S P W7GAEGI PAPTCW I GTD PGGPSRAHQPQAS D 
ANREP VAERS E PALSGL P PATMG SGDLLLS GES QVEKTKLS S S E 
EFPQTLSLPRTTICSGHDADTEDDPSLADLPQALDLSQQPHSSG 
LSCLSQWKSVLSPGSAAQPSSCS ISASSTGSSLQGHQERAEPRG 
GSLAKVSSSLEPWPQEPSSWGLGPRPQWSPQPVFSGGDASGL 
GRRRLS FQAE YWACVLPDSLPPS PDRHS PLWNPNKE YEDLLDYT 
YPLRPGPQLPKHLDSRVPADPVLQDSGVDLDSFSVSPASTLKSP 
TNVSPNCPPAEATALPFSGPREPSLKQWPSRVPQKQGGMGIASW 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








SQLASTPRAPGSRDARWEiiRBPALRGAKDRLTIGraiLDMGSPQL 
RTRDRGW PS P R PER EKRTS QS ARRPTCTESR WKS EEEVES DDE Y 
LALPARLTQVS SL VS YLGS I S TLVTLPTGDI KGQS PLEVSDSDG 
PASPPSSSSQSQLPPGAALQGSGDPEGQNPCFLRSPVRAHDSAG 
EGSLGSSQALGVSSGLLKTRPSLPARLDRWPFSDPDVEGQLPRK 
GGEQGKESLVQC\VKTFC\CQLEELICWLYNV\ADVTDHGTPAR 
SNLTSLK\SSLQLYRQFKXDIDEHQSLTESVLQKGBILLQCLLE 
NTPVLEDVI^RIAKQSGELESHADRLYDSILASLDMLAGCTLIP 
DKKPMAAMEHPCEGV 


5425 


1085 


115 


GFCPSPSLGHQPPRVLHPTMSMAVETFGFFMATVGLLMIiGVTLP 
NS Y WR VST VHGNVI TTJNT I FENLW FS CAT DS LGVYNCWE FPSML 
ALSGYIQACRALMITA1LLGFLGLLLGIAGLRCTNIGGLELSRK 
AKIjAATAGAPH\ ILPGICGMVAI \SWYAFNITR\DFSDPLYPGT 
KYELGPALYLGWSASLISILGGLCLCSACCCGSDEDPAASARRP 
YQAPVSVMPVATSDQBGDSSFGKYGRNALRVAALCRGPRCLPTA 
PKKRG PGRGP FP YSNLRGRPRP VP VAPPRPRPRVLHSHG PSQAK 
NCSWEVAYLPSEAGSLI F 


5426 


42 


3435 


ATSSQSLGRADPPRGGTMERSPGEGPSPSPMDQPSAPSDPTDaP 
PAAHAK PD PG SGGQPAG PGAAGEALAVLT S FGRR LLV L I PVYLA 
GAVGLS VG FV LFGLAL YLG W RR VRDEKEH SLRAARQLLDDEEQL 
TAKTLYMSHRELPAWVSFPDVEKAEWLNKIVAQVWPFLGQYMEK 
LLAET VAPAVRG S NPHLQT FTFTR VELGEKP LR I IGVKVHPGQR 
KEQ ILLDLNI S YVGDVQIDVEVKKYFCKAGVKGMQLHGVLRVIL 
EPLIGDLPFVGAVSMFFIRRPTLDINI^TOMTNIjLDIPGLSSLSD 
TMIMDSIAAFLVLPNRLLVPLVPDLQDVAQLRSPIjPRGIIRIHL 
LAARGLSSKDKYVKGL I EGKSDPYALVRLGTQTFCSRVl DEELN 
PQWGETYEVMVHEVPGQEIEVEVFDKDPDKDDFLGRMKLDVGKV 
LQASVLDDWFPLQGGQGQVHIiRLKWIjSLZjSDAEKLEQVLQWNWG 
VS S RPD P PS AAI LWYLDRAQDLPMVTS EL YP PQLKKGNKE PNP 
MVQLS I QDVTQES KAVYSTNCPVWEEAFRFPLQDPQSQBLDVQV 
KDDSRALTLG ALTLPLARLLTAP EL ILDQW FQL S £ 5G PNSRL YM 
KLVMR ILYLDSSE I CFPTVPGCPGAWD VDSENPQR GSS VDAPPR 
PCHTTPDSQFGTEHVLRIHVLEAQDL1AKDRFLGGLVKGKSDPY 
VKLKLAGRSFRSHWREDLNPRWNEVFEVIVTSVPGQELEVEVF 
DKDLDKDDFlXjRCKVRLTTVLNSGFLDEWLTLEDVPSGRLHLRL 
ERLTPRPTAAELEEVLQVNSLIQTQKSAELAAALLSIYMERAED 
LPLRKGTKHLSPYATLTVGDSSHKTK7ISQTSAPVWDESASFLI 
RKPKTESLELQVRGEGTGVLGSLSLPLSELLVADQLCLDRWFTL 
SSGQGQVLLRAQLGILVSQHSGVEAKSHSYSHSSSSLSEEPELS 
GGPPHITSSAPEV\RQRLTHVDSPLEAPAGPLGOVXLTLWYYSE 
ERKLVS XVHGCRSLRQNGRDP PDPYVSLLLLPDKNRGTKRRTSQ 
KKRTLS PE FNERFEWELPLDEAQRRKLDVS VKSNSS FMSREREL 
LGKVQLDLAETDLSQGVARWYDLMDNKDKGSS 


5427 


42 


3435 


ATSSQSLGRADPPRGGTMERSPGEGPSPSPMDQPSAPSDPTDQP

PAAHAKPDPGSGGQPAGPGAAGEALAVLTSFGRRLLVLIPVYLA 

GAVGLS VG FVLFGLALYLGWRRVRDE KE RSLRAARQLLDDE EQL 

TAKTL YMSHREL PAW VS FPDVE KAE WLNKI VAQVWPFLGQYMEX 

LLAETVAPAVRGSNPHLQTFTFTRVELGEKPLRI IGVKVHPGQR 

KEQILLDLNISYVGDVQIDVEVKKYFCKAGVKGMQLHGVLRVIL 

EPLIGDLPFVGAVSMFFIRRPTLDINWTGMTNLLDIPGLSSLSD 

TMIMDSIAAFLVLPNRLLVPLVPDLQDVAQLRSPLPRGIIRIHL 

LAARGLSSKDKYVKGIilEGKSDPYALVRLGTQTFCSRVIDEELN 

PQWGBTYEVMVHBVPGQEIEVEVFDKDPDKDDFLGRMKLDVGKV 

LQAS VLDD WFPLQGGQGQVHLRLEWLS LL S DAEKLEQ VLQWNWG 

VSSRPDPPSAAILWYLDRAQDLPMVTSELYPPOLKKGNKEPNP 

MVQLSIQDVTQESKAVYSTNCPVWEEAFRFFLQDPQSQELDVQV 

KDDSRALTLGALTLPLARLLTAPELILDQWFQLSSSGPNSRLYM 

KLVMR ILYLDSSE ICFPTVPGCPGAWDVDSENPQRGSS VDAPPR 

PCHTTPDSQFGTEHVLRIHVLEAQDLIAKDRFLGGLVKGKSDPY 

VKLKLAGRS FRSHWREDLNPRWNEVFEVI VTSVPGQELEVB VF 

DKDLDKDDFLGRCKVRLTTVLNSGFLDEWLTLEDVPSGRLHLRL 
ERLTPRPTAAELEEVLQVNSLIQTQKSAELAAAIiLSIVMERAEb" 
L PLRKGTKHLS P YATLT VGDSSH KTKTI S QTSAPVWDES AS FL I 
RKPHTESLELQVRGEGTGVLGSLSLPLSELLVADQLCLDRWFTL 
SSGQGQVLLRAQLGILVSQHSGVEAHSHSYSHSSSSLSEBPELS 
GGPPHITSSAPEV\RQRLTHVDSPLEAPAGPLGQVKLTLWYYSE 
ER KL VS I VHGCRSLRGNGRDPPD P YVSLLLLPDKNRGTKRRTS Q 
KKRTLSPEFNERFEWELPLDEAQRRKLDVSVKSNSSFMSREREL 
LGKVQLDLAETDLSQGVARWYDLMDNKDKGSS . 


5428 
5429 


3 


1839 


SSRSSRLSACAIAPPWLVSSRPARPAQLQRPGXMVEDGAEELED

LVHFSVSELPSRGYGVMEEIRRQGKLCDVTLKIGDHKFSAHRIV 

LAAS I PYFHAMFTNDMMECKQDE I VMQGM DPS ALEAL I N FAYNG 

NLAI DQQNVQS LLMGAS FLQLQS I KDACCT FLRERLHPKNCLGV 

RQFAETMMCAVLYDAANSFIHQHFVEVSMSEEFIALPLEDVLEL 

VSRDELNVKSEEQVFEAALAWVRYDREQRGTFL\RNLQSNIRLL 

FCRPQFIiSDRVQQDDLVRCCHKCRDLVDEAKDYLLMPERRPHLP 

AFRTRPRCCTS I AGL I YAVGGLN S AGDSLNWEV FDP I ANCWER 

CR PMTTAR SRVG VAVVNGLL YAI GG YDGQLRLS TVQAYNT E T DT 

WTRVGSMNS KRSAMGT WLDGQ I Y VCGG YDGNS SLSS VET YS PE 

TDKWTWTSMSSNRSAA\GVTVFEGRIYVSGGHDGLQIFSSVEH 

YNHHTATWHPAAGMLNKRCRHGAASLGSKMFVCGGYDGSGFL5I 

AEMYSSV\ADQMCLIVPM\HTRR\SRVSLGGPAVGRLYAWJGVT 

TGQSKL\SSVGDVLTPETDCWTFM\APMACHEGGVGVGCIPLLT 


5430


828 


202 


KREDALSSEGCLWPSESTVSGNGIPEPQVYAPPRPTDRLAVPPF 
AQRERFHRFQPTYPYLQHEIDLPPTISLSDGEEPPPYQGPCTLQ 
LRDPEQQLELNRE S VRAP PNRTI FDS DLMDSARLGGPC PP S SNS 
GISATCYGSGGRMEGPPP\TYSEVIGHYPGSSFQHQQSSGPPSL 
LEGTRLHHTH IAPLES AAI WSKEKDKQKGHPL 


5431 


441 


1507 


QKRRKRRRKKIMKTIQPKMHNSISWAIFTG1AALCLFC2GVPVRS 
GDATFP KAMDNVTVRQGE S ATLRCT I DNRVTRVAWLNRST 1 LYA 
GlTOKWCLDPRVVLLS^TQYSIEIQNVDVYDEaPYTCSVQTDN 
HPKTSRVHLIVQVSPKIVEISSDISINEGNNISLTCIATGRPEP 
TVTWRH I S P KAVG FVS EDE YLE I QGITREQSGDYE CS ASND V\ A 
APV\VRRVKVTVWYPPYISEAKGTGVPVGQKGTLQCEASAVPSA 
EFQWYKDDKRLI/EGKKGVKVENRPFLSKLIFFNVSEHDYGNYT 
^ASl^HTNASlMLFGPGAVSEVSNGTSRRAGCVWLLPLLVl, 


5432 


2 


1312 


AAAAPGSRRRRPLPDRPHMAHGYEAPPPPAPRSPAWRARSKPV\~ 
LPGIT I NP \ TIAEG P S P \TS EGAS EANLVDLQKKLEELELDEQQ 
KKRLEAFI»TQBCAKVG ELKDDDF ER I SELGAGNGGWTKVQHR PS 
GLI MARKL I HLE I KPAI RNQ 1 1 RELQ VLHECNS P YI VGF YGAF Y 
S DGE I S I CMEHMDGG S LDQVLKE AKRI PEE I LGKVS IAVLRGLA 
YLREKHQ IMHRDVKPSNILVNSRGE IKLCDFGVSGQLI DSMANS 
F VGTRS YMAPERLQGTH YSVQS D I WSKGLSL VEIAVGR YPXP P P 
DAKELEAIFGRPWDGEEGEPHSISPRPRPPGRPVSGHGMDSRP 
AMAI FELLD YI VNB P P P KLPNG VFTPDFQE FVNKCL I KNPAERA 
DLKMLTNHTFIKRSEVEEVDFAGWLCKTLRLNQPGTPTRTAV 


5433 


2 


1312 


AAAAPGSRRRKPLPDRPHMAHGYEAPPPPAPRSPAWRARSKPVV~ 
LPGITINP\TIAEGPSP\TSBGASEANLVDLQKKLBELELDEQQ 
KKRLEAFLTQKAKVGELKDDDFERISELGAGNGGWTKVQHRPS 
GLIMARKLIHLEIKPAIRNQIIRELQVLHECNSPYIVGFYGAFY 
SDGEISICMEHMDGGSLDQVLKEAKRI PEE ILGKVS IAVLRGLA 
YLREKHQ IMHRDVKPSNTLVNSRGEI KLCDFGVSGQLIDSMANS 
F VGTRS YMAPERLQGTH YS VQS D I WSMGLSLVE LA VGRYP I PPP 
DAKELEAIFGRPWDGEEGEPHSISPRPRPPGRPVSGKGMD9RP 
\MAI PELLDYIVNEPPPKLPNGVFTPDFQEFVNKCLI KNPAERA 
PLKMLTNHTFIXR5EVEEVDFAGWLCKTLRLNQPQTPTRTAV 




360 


1885 



3VQEDKVGFEDPLHLCSWRARACPCTWPHC/CTGLLECLGFAGV' 
jFGW PS LVF VFKNEDYF KDLOG PDAG P IGNATGQADC KAQDE RF 
5L I FTLGS FMNN FMTFP TG YI FDR FKTTVARL I A I FF YTTATLI 
IAFTSAGSAVLLPLAMPMLTIGGILFLITNLQlCSNLFGQHRSfT^ 
ITLYNGAPDSSSAVFLIIKLLYEKGISLR/VIjLHLHLCLQYXiAC 
STHFPPDAPGAHPlPrAPQLQLWPVPWBWHHKGREXS/QQLSMKT 
GSYSQRSSFQRRKRPQGQGRSRNSAPSGATL/CSRRPAWHLVWL 
S VI QLWHYLF IGTLNS LLTNMAGGDMARVS TYTNAFAFTQ FG VL 
CAP WNGLLMDRLKQ KYQKEAR KTGS S TLAVALCSTVPS LALTSL 
LCLGFALCASVPILPLQYLTFILOVISRSFLYGSNAAFLTLAFP 
SEHFGKLFGIiVMALSAWSLLQFPlFTLIKGSLQNDPFYVNVMF 
MLAI LLTFFHPFLVYRE CRTW KES PS AXA 


5434 


66 


652 


R YAAL I 1 S L I QHKLLWRNQHCS R C V I MS PAQS AGLNWL F /GSGK " i 
HGPFU3CSQYPACDYVRPLKSSADGHIVKVLEGQVCPACGANLV 
LRQGRFGMFIGCINYPECEHTBLIDKPDETAITCPOCKTGHIiVQ 
RRSRYGKTFHSCDRYPECQFAINFKPIAGBCPECHYPLLIEKKT 
AQGVKHFCASKQC33KPVSAE | 


5435 


4704 


1597 


PGDSSQRLAEMSNAKERKHAKKMRNQPTNVTLSSGFVADRGVKH 
! HSGGEKPFQAQKQEPHPGTSRQRQTRVNPHSLPDPEVNBQSSSK 
GMFRKKGGWKAGPEGTSQEIPKYITASTFAQARAAEISAMLKAV 
TQKS SNSLV FQTLP RHMRRRAMSHNVKRLPRRLQE IAQKEAE KA 
VHQ KKEHS KNKCHKARRCHMNRTLEFNRRQ KKN I WLETH I WHAK 
R FHMVKKWG YCLGERPTVKSHRACYRAMTNRCLIjQDLS YYCCLE 
LKGKEEEILKALSGMCNIDTGLTFAAVHCLSGKRQGSLVLYRVN 
KYPREMLGPVTFIWKSQRTPGDPSESRQLWIWIiHPTLKQDILEE 
IKAACQCVEPIKSAVCIADPLPTPSQEKSQTEIiPDEKIGKKRKR 
KDDG E NAKP I KKI IGDGTRDPCLP YSW IS PTTGI 1 1 SDLTMEMN 
RFRLI GPLSHS I LTEAI KAASVHTVGEDTEETPHRW WI ETCKKP 
DSVSLHCRQEAIFELLGGITSPAEIPAGTILGLTVGDPRINLPQ 
KKS KALPNP EKCQDNEKVRQULL EG VP VECTHS F I WNQD I CKS V 
TENKI SDQDLNRM RS ELLV PGSQL I LG PHES K I P I LLI QQ PG KV 
TG EDRLGWGS G WDVLLP KG WGMAFWI PFI YRGVRVGGLKESAVH 
SQYKRSPNVPGDFPDCPAGMLFAEEQAKNLLEKYKRRPPAKRPN 
YVKLGTLAPFCCPWEQLTQDWESRVQAYEEPSVASSPNGKESDL 
RRS E VPCAPMPKKTHQPSDEVGTS 1 EHPREAEEVMDAGCQESAG 
PERITDQEASENHVAATGSHLCVLRSRKLLKQLSAWCGPSSEDS 
RGGRRAPGRGQQGLTREACLS ILGKFPRALVWVSLSLLSKGSPE 
PHTMICVPAKEDFLQLHEDWHVCGPQESKHSOPFRSKILKQKEK 
KKREKRQKP\GRASSDGPAGEEPVAGQBALTLQLWSGPtiPRVTli 
HCSRTLLGFVTQGDFSMAVGCGEALGFVSLTOLLDMIiSSGPAAQ 
RGIiVLIiRPPASLQYRFARlAIEV f 


5436 


1781 


635 


ASDSIPWSEARTTRKIiAQRGCQWSLPERMPLVVFCGLPYSGKSR"| 

RAEELR VA1J\AEGRA VYVVDDAAVLGAEDPA VYG DS AREKALRG 

ALRASVERRLSRHDWILDSLNYIKGFRYELY\CLARAARTPLC 

LVYCVRPGGPIAGPQVAGANENPGRNVSVSWRPRAEEDGRAQAA 

GSS VLRELHTADS WNGSAQADVPKELEREESGAAE5 PALVTPD 

SEKSAKHGSGAFYSPELLEALTLRFEAPDSRNRWDRPLFTLVGIj 

EEPLPLAGIRSALFENRAPPPHQSTQSQPLASGSFLHQLDQVTS 

QVLAGLMEAQKSAVPGDLLTLPGTTEHLRFTRPLTMAELSRLRR 

QFISYTKMHPNNENI.PQLANMFLQYLSQSLH 


5437 
5438


739 


1^72 


CQEAASEFGGpi^TPAMFLRRLGGWLPRPWGRRKPMRPDPPYPE | 
PRRVDSSS ENSGSDWDSAPETMEDVGHPKTXDSGALRVSRAASE 
PSKEEPQVEQLGSKRMDSLKWDQPISSTQESGRLEAGGASPKLR 
WDHVDSGGTRR PG VS PEGGL \G VPGPGAP LEKPGRR EKLLGWLR 
GEPGAPSRYLGGPEECLQISTNLTLHLLELLASALLALCSRPLR 
AAI1DTI1GLRGPLGLWLHGLLSFLAAI4HGLHAVLSLLTAHPLHFA 
CLFGtiLQAL VLAVS LREPNGDE AATDWE S EG1»E REGE EQRGD PG 
KGL 




2443


1152 


TKPRKRRHQPASQRQRPWSSDSTGDI^GKGRKEENKGSDRVsH 
IAPPSLRRPMMCQSEARQGPELRAAKWLHFPQLAIiRRRLGQLSC 
MSRPALKLRSWPLTVLYYLLPFGALRPLSRVGWRPVSRVALYKS 
VPTRLLSRAWGRLNQVELPHWLRRPVYSLYIWTFGVNMKEAAVE 
DLHHYRNLSEFFRRKLKPQARP VCGX.HS VI SPSDGRI LNFGQVK 
NCEVEQVKGVTYSLESFLGPRMCTEDLPFPPAASCDSFKNQLVT
REGNELYHCVIYLAPGDYHCFHSPTDWrVSHRRHFPGSXiMSVNP 
GMAR WIKE L FCHNER WLTGDWKKG PFS LTAVGAT \ NWGS I RI Y 
FDRDLHTNS PRHS KGS YNDFS FVTHTNREGVPMALRGEHLG /QS 
FNLGS TI VL I FEAPKD FNFQLKTGQKIRFGEALG S L 


5439 


2443 


1152 


tkprkrrhqpasOrOrpwssdstgdliargkgrkeenkgsdrvs 

LAPPSLRRPMMCQSEARQGPELRAAKWLHFPQLAIiRRRLGQLSC 

MSRPALKLRSWPLTVLYYLLPFGALRPLSRVGWRPVSRVALYKS
vptrli^rawgrlnqvelphwlrrpvyslyiwtfgvnmkeaave 
DLHHYRNLSEFFRRKLKPQARPVCGLHSVISPSDGRILNFGQVK
NCEVEQVKGVTYSLESFLGPRMCTEDLPFPPAASCDSFKNQLVT
REGNELYHCVIYLAPGDYHCFHSPTDWTVSHRRHFPGSLMSVNP
gmarw i kelfchn3rwltgdwkhgffsltavgat\nwgs i r x y 
FDRDLHTNSPRHSKGSYNDFSFVTHTNREGVPMALRGEHLG/QS
FNLGSTIVLIFEAPKDPNFQLKTGQKIRFGEALGSL


5440 




253 


EP I P VTPDHRLVTMTH I V \QTFSPVNS \GQPPNYEMLKEEQEVA 

MLGAPHNPAPPMSTVIHIRSETSVPDHWWSLFNTLFMNTCCLG
fiafa ys vks rdrkmvgdvtgaqayas takcln i w al» 1 1x3 1 fmt 
ILLIIIPVLWQAQR


" 5441 " 


2 


2054 


CRDGGKNGFIWSPMKPLEIKTQCSGPRMDPKICPADPAFFSFIN 
NSDLWVANIETGEERRLTFCHQGLSNVLDDPKSAGVATFVIQEE 
FDRFTGYWWCPTASWEGSBGLKTLRILYEEVDESBVEVIHVPSP 
ALEERKTDSYRYPRTGSKNPKIALKLAEFQTDSQGKIVSTQEKE 
LVQPFSSLFPKVEYIARAGWTRDGKYAWAMFLDRPQQWLQLVLL 
P PAL FI PSTENEEQ \RLASARAVPRNVQP * WYEE VTNVWINVH 
DIFYPFPQSEGEDELCFLRANECKTGFCHLYKVTAVLKSQGYDW 
SEPFSPGEGEQSLTNAIWVNEETKLVYFQGTKDTPLEHHIjYWS 
YEAAGEIVRLTTPGFSHSCSMSQNFDMFVSHYSSVSTPPCVHVY 
KLSGPDDDPLHKQPRFWASMMEAAKIFHFHTRSDVRLYGMIYKP 
HALQPGKKHPXVLFVYGGPQVQLVNNSFKG I KYLRLNTLASLG Y 
AVWI DGRGS CQRGLR FEGALKNQMGQVE I EDQ VEGLQFVAEK Y 
GFIDLSRVAIHGWSYGGFLSLMGLIHKPQVFKVAIAGAPVTVWM 
AYDTG YTER YMD VPENNQHGYEAG S VALHVEKLPNE PNRL LI LH 
GFLDENVH FFHTNFLVS QLI RAGKP YQLQ VALPPVS PQI YPNER 
HSIRCPESGEHYEVTLLHFLQEYL 


5442 


X 


3474


CGQRSRRRS PDMPBAKPAAKKAP KG KDAPKGAPKEAP P KE APAE " 

APKEAPPEDQSPTAEEPTGVFLKKPDSVSVETGKDAWVAKVNG 

KELPDKPTIKWFKGKWLELGSKSGARFSFKESHNSASNVYTVEL 

HIGKWLGDRGYYRLEVKAKDTCDSCGFNIDVEAPRQDASGQSL 

ESFKRTSEKKSDTAGELDFSGLLKKREWEEEKKKKKKDDDDLG 

IPPEIWELLKGAKKSEYEKIAFQYGITDLRGMLKRLKKAKVEVK 

KS AAFTKKLDPA YQVDRGNK 1 KLMVEI SDPDLTLKWFKNGQEI K 

PSSKYVFENVGKKRILTINKCTLADDAAYEVAVKDEKCFTELFV 

KEPPVLIVTPLEDQQVFVGDRVEMAVEVSEEGAQVMWMKDGVEL 

TREDSFKARYRFKKDGKRHIL I FSDWQEDRGRYQ VI TNGGQCE 

AELIVEEKQLEVLQDIADLTVKASEQAVFKCEVSDEKVTGKWYK 

NGVEVRPSKRITISHVGRFHKLVIDDVRPEDEGDYTFVPDGYAL 

GSLSAKLNFLEIKVEYVPKQ\EPPKIPLGFASGGKTSENAD/IV 

WAGNKLRLDV\SITGEAPSPFAT\WLKG\DEVFTTTEGRTRIE 

KRVDCSS FVIESAQREDEGRYTIKVTNPIGEDVAS IFLQWD VP 

DPPEAVRITSVGEDWAILVWEPPMYDGGKPVTGYLVERKKKGSQ 

RWMKLNFEV FTETTY EST KMI EG ILYEMRVFAVNAI G VSQP SMN 

TK PFM P I APTS3PLHLI VEDVTDTTTTLKWR P PWR I GAGGI DG Y 

LVEYCLEGSEEWVPANTEPVERCGFTVKNLPTGARILFRVVGVN 

I AGRS EPATLAQP VT IRE I AE PP KI RL PRHLRQTY IRKVGEQLN 

LWPFQGKPRPQWWTKGGAPLDTSRVHVRTSDFDTVFFVRQAA 

RSDSGEYELSVQIENMKDTATIRIRWEKAGPPINVMVKEVWGT 

NALVEWQAPKDDGNSEimYFVQKADKKTMEWFNVYERNRHTSC 

TVSDLIVGNEYYFRVYTENICGLSDSPGVSKNTARILKTGITFK 

PPE YKEHDFRMAP KFLTPL I DRWVAG YSAALNCAVRGHPKPKV 

VWMKNKMEI REDPKFLI TWYQG VLTLWIRRPSPFDAGTYTCRAV 

NELGEALAECKLEVRVPQ 


5443 


66 


1003 


SRGQLDAGQSSEQHGGNRQPEQSRSRSSSSSSSPRRSRSAAEPA 
MALSMPLNGLKEEDKE PLI ELFVKAGSDGESIGNCP FS QRLFM I 
LWLKGWFSVTTVDLKRKPADLQNLAPGTHPPFITFNSEVKTDV 
NKI EE FLEE VLCPPKYLKLS PKH PESNTAGMDI FAKFS AYIKNS 
RPEANEALERG tiLKTLQKLDEYLNS PLPDEIDENSMED I KFSTR 
KFLDG^MTIADCNLLPKLHIVKVVAKKYRNFDIPKEMTGIWRY 
LTNAYSRDE FTNTCPS DKEVE I \ AYSDVAKRLHQVKSRLIiKEVS 
FMSSP 


5444 


2 


344 


SGPIGVTGAQMAKWLRDYLSFGGRRPPPQPPrPDYTESDILRAY 
RAQKNLDFEDPY+DSESRLEPDPAGPGDSKNPGDAKYGSPKHRL 
1 KVEAADMARAKALLGGPGEELBADTEYLDPFDAQPHPAP PDDG 
YME P YDAQWVMS ELPGRG VQL YDTPYEEQDPETAIX3PPSGQKPR 
QSRM PQEDER PAJDE YDQP WEWKKDHI S RAFAVQFDS PE WERTPG 
SAKELRRPPPRSPQPAERVDPALPIiEKQPWFHGPLNRADAESLIi 
SLCKEGSYLVRLSETNPQDCSLSLRSSQGFLHLKFARTRENQW 
LGQHSGPFPS VPE^VLHYSSRPL P VQGAEHLALLY P WTQTP * Q 
*PDWGDRRPNGQVATGLPELWGAEAPSAAAHPGLiHRERHPEGLr 
RAEKPGLRGPLLGLREPLGAGPRGPWGLQEPRRCQVWFSQAPAH 
QGGGCGYGQSQGPSGRPRGGAGSRH \ 


5445 


2364 


486 


ILSRGFLGSVEICIQLPLPASEPVLLLTWARRRWRETRSRREPT 
TLRAQSVCPWWI*ETRMNRSIPVEVDESEPYPSQLLKPIPEYSP 
EEESEPPAPNIRNMAPNSLSAPTMLHNSSGDFSQAHSTLKLANH 
QRPV3RQVTCLRTQVLEDSEDSFCRRHPGLGKAFPSGCSAVSEP 
AS ES WGALPAEHQFSFME KRNQWLVSQLS AAS PDTGHDSDKSD 
QS LPNASADS LGGSQEMVQRPQPHRNRAGLDLPTIDTGYDSQPQ 
DVLG I RQLBR PLPLTS VC YPQDLP RPLRSREF PQFEPQRY P ACA 
QMLP PNLSPHAPWNYH YHCPGS PDHQ VPYGHD YPRAAYQQVTQP 
ALPGQPLPGASVRGLHP VQKVI LNYPSPWDQEERPAQRJDCS FPG 
LPRHQDQPHHQPPNRAGAPGESLECPAELRPQVPQPPSPAAVPR 
PPSKPPARGTLKTSNLPEELRKVFITYSMDTAMEWKFVNFLLV 
NGFQTAID I FEDRIRGIDI IKWMER YLRDKTVM r I VAIS PKYKQ 
DVEGAESQLDED3HGLHTKYIHRMMQ IEFI KQGSMNFRFI PVLF 
PNAKKEHVPTWLQNTHVYS WPKNKKN I LLRLLRE EEYVAPPRGP 
LPTLQWPL 


5446


972


161 


SSWSWCTGRMRKTRLWGLLWMDFVSELRAATKLTEEKYELKEGQ 
TLDVKCDYTLEKFASSQKAWQI IRDGEMPKTLACTBRPS KNSH P 
VQVGRI ILEDYHDHGLLRVRMVNLQVEDSGLYQCVIYQP PKEPH 
MLF13RIRLWTKGFSGTPGSNENSTQNVYKIPPTTTKALCPLYT 
TPRTVTQAPPKS TADVS TPDS EINLTNVTD 1 1 R VP VFN 1 VI LLA 
GG FLS KS LVFSVL FAVTLRS FVP * AH E PTRMS SDFQPHPSGS CA 
KGGGRR 


5447 


207 


617 


MTARTLSLMASLVAYDDSDSEAETEHAGSFNATGQQKDTSGVAR 
PPGQDFASGTLDVPKAGAQPTKHGSCEDPGGYRLPIiAQLGRSDR 
GSCPSQRLQWPGKEPQVTFPIKEPSCSSLWTSHVPASHMPLAAA 
RFKQVKLSRNFPKSSFHAQSESETVGKNGSSFQKKKCEDCVVPY 
TPRRLRQRQALS TETGKG KD VEPQGPPAGRAPAPIi YVG PG VS E F 
IQPYLNSHYKETTVPRKVLFHLRGHRGPVNTIQWCPVLSKSHML 
LSTSMD KTF KVWNAVDSGHCLQTYS LHTEAVRAARWAP CGRRIL 
SGGFDFAIiHLTDLETGTQLFSGRSDFRITTLKFHPXDHNIFLCG 
GFSSEMKAWDIRTGKVMRSYKATIQQTLDILFLREGSEFtiSSTD 
ASTRDSADRTI I AWDFRTSAKISNQI FHERFTCPSLALHPREPV 
FliAQTNGNYLALFSTVWPYRMSRRRRYEGHKVEGYSVGCECSPG 
GDLLVTGS ADGRVLM YS PRTAS RACTLQGHTQAC VGTTYH P V LP 
SVLATCSWGGDMKIWH*AFHWLSLGEAIGDLAPARGYSGPGRSL 
KSPSPS KS LLVLLCGRAMFQ PATCPWQLPALS K 


5448 


194 


1833 


MAS KVTDAI VWYQKKIGAYDQQI WE KS VEQRE I KGLRNKPKXTA 
HVKPDLIDVDLVRGSAFAKAKPESPWTSLTTKG IVRWFFP FFF 
RWWLQVTSKVI FFWUjVLYLLQVAAI VLFCSTS SPHSI PLTEVT 
GPIWLMLIjLGTVHCQIVSTRTPKPPLSTGGKRRRKLRKAAHLEV 

HREGDGSSTTDNTQEGAVQNHGTSTSHSVGTVFRDLWHAAFFLS
GSKKAKNSIDKSTETDNGYVSLDGKKTVKSGEDGIQNHEPQCET
IRPEETAWNTGTLRNGPS KDTQRT I TNVSDEVSSEEGPETG YSL 
RRHVDRTSEGVLRNRKSHHYKKHYPNEDAPKSGTSCSSRCSSSR 
QDSESARPESETEDVLWEDLLHCAECHSSCTSBTDVEKHQINPC 
VKKEYRDDPPHQSHLPWIiHSSEIPGLEKISAIVWEGNDCKKADMS 
VLE ISGMIMNRVNSHIPGIGYQI FGNAVSLI LGLTPFVFRLSQA 
TDLEQLTAHSASELYVIAFGSNEDVIVLSMV1ISFWRVSLVWI 
FFFLLCVAERTYKQVGIM*TSEGVLRNRKSHHYKKHYPNEDAPK 
SGTSCSSRCSSSRQDSESARPESETEDVX.WEDLLHCAECHSSCT 
SETDVENHQ INP CVKKE YRDD P FHQSHLPWLHS SH PGLE KI S AI 
VWEGNDCKKADMS VLEISGM I MNRVNSHI PGIGYQ I PGNAVSLI 
LGLTP FVFRLSQATDLEQLTAHSAS ELY VI AFGSNEDVI VLSMV 
I IS FWRVSLVWI FFFLLCVAERTYKQVGIM 


5449 


194 


1833 


MASKVTDAIVWYQKKIGAYDQQIWEKSVEQREIKGLRlJKPKKtA 
HVKPDL IDVDLVRGS AFAKAXPES PWTS LTTKG I VRWFFP FFF 
RWWLQVTSKVI FFWLLVL YLLQ VAAI VL FCSTSS PHSIPLTEVI 
GPIWLMLLLGT VH CQI VS TRTP KP PLSTGGKRRR XLRKAAHLE V 
HREGDGSSTTDNTQBGAVQNHGTSTSHSVGTVFRDLWHAAFFDS 
GSKKAKNSIDKSTETDNGYVSLDGKKTVKSGEDGIQNHEPQCST 
I RPEETAWNTG TLRNGPS KDTQRTI TNVSDEVSSEEGPETG YSL 
RRHVDR TS EGVLRNRKSHH YK KHYPNEDAP KSG TSCS SRCS S S R 
QDSES ARPES ETEDVLWEDLLHCAE CHS SCTS B TD VENHQ I NPC 
VKKE YRDDP FHQSHL P WLHS S HPGLEK I S AI VW EGNDCK KADMS 
VLE IS GMIMNR VNSHI PG IG YQI FGNAVSL ILGLTPFVFRLSQA 
TDLEQLTAHSASELYVI APGSNEDV I VLSMVI I S FWRVS LVWI 
F FFLLCVAERT YKQ VG I M * TS EGVLRNRKSHHY KKHYPNEDA? K 
SGTSC3S RCSSSRQDS ESAR PE SETE D VL WEDLLHCAECHS S CT 
S ETDVENHQINPCVKKEYRDDPFHQSHLPWLHSS HPGLEK I SAI 
VWEGNDCKKADMSVLEI SGMt MNRVNSH I PGIGYQI FGNAVSLI 
LGLTP FVFRLS QAT DLEQLTAHS AS EL YVI AFG SNE DVI VLSMV 
1 1 SFWRVSLVWI FFFLLCVAERTYKQVGIM 


5450 


B136 


1242 


GQQFAS F FG * NHPE VT VAMALTDI DLQL QFSMSQ PE ALLLLAAG 

PADHLLLQLYSGHLQVRLVLGQE ELRLQTPAETLLSDS I PHT W 

LTWEG WATLSVDG FLNAS S AVPGAPLEVP YGL F VGGTGTLGL P 

YLRGTSR PLRG CLHAATLNGR S LLR PLTFDVHEGCAEEFSASDD 

VALGFSGPHSLAAFPAWGTQDEGTLEFTLTTQSRQAPLAFQAGG 

RRGDF I Y VD I FEGH LRAWE KG QGTVLLHNS VP VADGQPHE VS V 

HINAHRLBISVDQYPTHTSNRGVLSYLEPRGSLLLGGLDAEASR 

HLQEHRLGLTPEATNASLLGCMEDLSVNGQRRGLREALLTRNMA 

AGCRLBEEEYEDDAYGHYEAFSTLAPBAWPAMELPEPCVPEPGL 

PPVFANFTQLLTISPLWAEGGTAWLEWRHVQPTLDLMEAELRK 

SQVLFSVTRGAHYGEIiEIiDILGAQARKMFTLIiDVVNRKARFIHD 

GSEDTS DQL VLE VS VTARVPM P S CLRRGQTYLLP IQ VNPVNDP ? 

HIIFPHGSLMVILEHTQKPLGPEVFQAYDPDSACEGLTFQVLGT 

SSGLPVERRDQPGEPATEFSCRELEAGSLVYVHCGGPAQDLTFR 

VSDGLQAS P PATLKWAI R PA I Q IHRS TGLRLAQGSAMP ILPAN 

LS VETNAVGQDVS VLFRVTGALQ FGELQKHS TGGVEGAE WWATQ 

AFHQRDVEQGRVRYLSTDPQHHAYDTVENIALEVQVGQEILSNL 

SFPVTIQRATVWMLRLEPLHTQNTQQETLTTAHLEATLEEAGPS 

PP T FH YE WQAPRKGNLQLQGTRLSDGQG FTQDD IQAGRVT YGA 

TARASEAVEDTFRFRVTAPPYFS PLYTFP IH IGGDPDAP VLTNV 

LL WPEGGEG VLS ADHL FVKS LNSAS YL YE VMERPRLGRLAWRG 

TQDKTTMVTS FTNEDLLRGRLVYQHDDSETTEDD I PFVATRQGE 

S SGDMAWEE VRGVFRVAI QPVNDHAPVQT I S R I FHVARGGRRLL 

TTDDVAFS DADSGFADAQLVLTR KDLL FGS I VAVDEPTRPI YRF 

TOBDLRKRRVLFVHSGADRGWIQLQVSDGQHQATALLEVQASEP 

YLRVANGSSLWPQGGC3GTIDTAVLHLDTNLDIRSGDEVHYHVT 

AGPRWGQLVRAGQPATAFSQQDLLDGAVLYSHNGSLSPEDTMAF 

SVEAGPVHTDATLQVTIALEGPLAPLKLVRHKKIYVFQGEAAEI 

RRDQLEAAQEAVPPADIVFSVKSPPSAGYLVMVSRGALADEPPS 

LDPVQSFSQEAVDTGRVLYLHSRPEAWSDAFSLDVASGLGAPLE 

GVLVELEVLPAAIPLEAQNFSVPEGGSLTLAPPLLRVSGPYFPT 
LLGLSLQVLEPPQHGPLQKEDGPOARTLSAFSWRMVEEQLIRYV 
HDG SETLTDS FVLMANASBMDRQSHPVAFt VTVLPVNDQ PP I LT 
TNTGLQMWEGATAP I PABALRSTDGDSGSEDLVYTIEQPSNGRV 
VLRGAPGTEVRS FTQAQLDGGLVLFSHRGTLDGG FPFRLSDGEH 
TSPGHFFRVTAQKQVLLSLKGSQTLTVCPGSVQPLSSQTLRASS 
S AGTD PQLIJjY RWRGPQLGRLFKAQQDSTGEALVN FTQ AE VYA 
GN I LYEHEMPPEPFWEAHDTLELQLSS PPARDVAATLAVAVS FE 
AACPQRPSHLWKNKGLWVPEGQRARITVAALDASNLLAS VPS PQ 
RSEHDVLFQVTQFPSRGQliLVSEEPLHAGQPHFIiQSQLAAGQLV 
Y AHGGGGTQQDG FHFRAHLQGPAGAS VAG PQTSE AFAIT VRDVN 
ERPPQPQASVPLRLTRGSRAPI5RAQLSWDPD5APGEIEYEVQ 
RAPHNG FLSLVGGGLGP VTRFTQADVD SGRLAFVANG S S VAG I F 
QLSMSDGAS P PLPMSLAVD I LPSAI EVQLRAPLE VPQALGRS SL 
SQ0QLRWSDREEPEAAYRLX(3GPQYGKLLVGGRPrSAFSQFQI 
D0/5EWFAFTNFSSSHDHFRVUUJ^GVNASA\rVWrVRA^ 
WAGG P W PQGAT&RLDPT VLDAG E LANRTGS VPRFR LLEG PRHGR 
WRVPRARTEPGGSQLVEQFTQQDLEOGRLGLEVGRPEGRAPGP 
AGOS LTLELWAQGVP PAVAS LDFATE P YNAARP YS VALL S VP EA 
ARTEAGKPES S TP TGEPG PMASSPEPAVAKGGFLS FLEANMFS V 
I IPMCL VLLLLAL I LPLLFYLRKRNKTGKHDVQVLTAKP RNG LA 
GDTET FRKVE PGQ AI PLTAVPGQG P PPGGQ PDPE1»LQFCRTPMP 
ALKNGQYWV 


5451 


1


2274 


RDS S EQGRTGDTLGRPSACMDALKPPCLWRNHERGKKDRDSCGR 
KNSEPGSPHSLEALRDAAPSQGbNFLLLPTKMLFIFNFLFSPLP 
TPALI CILTFGAAI FLWLITRPQPVbPLLDLNNQS VGIEGGARK 
GVSQKNNDLTSCCFSDAKTMYEVFQRGLAVSDNGPCLGYRKPNQ 
PYRWLS YKQVSDRAB YLGSCLLHKGYKS SPDQFVG I FAQNRPEW 
IISELACYTYSMVAVPLYDTLGPEAIVHIVNKADIAMVICDTPQ 
KALVLIGNVEKGFTPSLKVI XLMDPFDDDLKQRGE KSGI E ILSL 
YDAENLGKEHFRKPVPPSPEDLSVICFTSGTTGDPXGAMITHQN 
IVSNAAAFLKCVEHAYEPTPDDVAISYLPIiAHMFERIVQAWYS 
CGARVGFFQGD IRLLADDMKTLKPTLFP AVPRLLNRI YDKVQNE 
AKTPLKKFTJjKLAVSSKFKELQKGIIRHDSFWDKLIFAKIQDSIj 
GGRVRVIVTGAAPMSTSVMTFFRAAMGCQVYEAYGQTECTGGCT 
FTLPGDWTSGHVGVPLACNYVKLEDVAI3MNYFTVNNEGEVCI KG 
TNVFKGYLKDPEKTQEALDSJDGWLHTGDIGRWLPNGTLKIIDRK 
KNIFKLAQGEY1APEKIEN1YNRSQPVLQIFVHGESLRSSLVGV 
WPDTDVLPSFAAKLGVKGSFEELCQNQWREAILEDLQKIGKE 
SGLKTFEQVKAIFLHPEPFSIENGLLTPTLKAKRGELSKYFRTQ 
IDSLYEHIQD 


5452


1833 


1138 


SRVPS LCLSLSLSL S PSREP VAGAPG CGTAGPPAMATL WGGLLR 
LGSLLSLSCLALSVLLLAQLSDAAKNFEDVRCKCICPPYKEWSG 
HIYNKNI SQKDCDCLHWEPMPVRGPDVEAYCLRCECKYEERSS 
VTIKVTI IIYLS 1LGLUXYMVYLTLVEPILKRRLFGHAQLI QS 
DDDIGDHQPFTyNAHDVLARSRSRANVLNKVEYAQQRWKLQVQEQ 
RKSVFDRHWLS 


5453 


111 


1520 


PSIPAAVPQSAPPEPHREETVTATATSQVAQBPPAAAAPGBQAV 
AGPAPSTVPSSTSKDRPVSQPSLVGSKEEPPPARSGSGGGSAKE 
PQEERSQQQDDIEELETKAVGMSNDGRFLKFDIEIGRGSFKTVY 
KGLDTETTVEVAWCELQDRKLTKSERQRFKEEABMLKGLQHPNI 
VRFYDSWESTVKGKKCIVLVTELMTSGTLKTYLKRFKVMKIKVL 
RSWCRQILKQLQFbHTRTPPIIHRDhKCDNlFITGPTQSVKlQD 
LGLATLKRASFAKSVIGTPEFMAPEMYEEKYDESVDVYAFGMCM 
LitnATat i F i StCQNAAQI YRRVTSGVKPASFDKVAI PEVKEI I 
EGCIRQNKDERYS I KDLLNHAFFQEETG VRVELABEDDGfiKIAI 
KLWLRIEDIKKLKGKYKDNEAIEFSFDLERNVPEDVAQEMVESG 
YVCEGDHKTMAKA I KDRVSL I KRKR EQRQL * 


5454 


111 


1520 


PS I PAAVP QS APPE PHRRBTVTATATSQ VAQQP PAAAAPGEQAV 
AGPAPSTVPSSTSKDRPVSQPSLVGSKEEPPPARSGSGGGSAXE 
PQEERS QQQDD I EE LE TKAVGMS NDGRFLKFD X EI GRG S FKTVY 
KGLDTETTVEVAWCELQDRKLTKSERQRF1CEEAEMLKGLQHPNI 
WFYDSWESTVKGKKCIVLVTBLMTSGTLKTYLKRFKVMKIKVL 
RSWCRQILKGLQFLHTRTPPIIHRDLKCDNIFITGPTGSVKIGD 
LQLATLKRAS FAKS VIGTPEFMAPEMYEEKYDES VDVYAFGMCM 
LBMATSEYPYSECQNAAQIYRRVTSGVKPASFDKVAIPEVKEII 
EGCIRQNKDER YS I KDLLNHAFFQEE7G VR VELAEEDDGE KIAI 
KLVTLRIEDIKKLKGKYKDNEAIEFSFDLBRNVPEDVAQEMVESG 
YVCEGDHKTMAfCAI KDR VS LIKRKREQRQL* 


5455 


1359 


377 


LTMVSPATRKSLPKVKAMDFITSTAILPLLFGCU3VFGLFRLLQ 
WVRGKAYLRNAVWITGATSGLGKECAKVFYAAGAKLVLCGRNG 
GALEELIRELTASHATKVQTHKPYLVTFDLTDSGAIVAAAAEIL 
•QCFGYVDILVNNAGISYRGTIMDTTVDVDKRVMETNYFGPVALT 
KALLPSMIKRROGHJVAISSIQGKMSIPFRSAYAASKHATQAFP 
DC LRAEM EQY E I E VT VI S PG YI HTNLS VNAITADGS RYG VMDTT 
TAQGRS P VE VAQDVLAAVGKKKKDVI uADLLPSLAVYLRTLAPG 
LFFSLMASRARKSRKSKNS 


5456 


2 


2332 


CGAGLVAAGAVLVLY PAS RAGE RTRVP3S PAPSSLPLHS PGACG 
TEVDMDPQRSPLLEVKGNIELKRPLIXAPSQLPLSGSRLKRRPD 
QMEDGLEPEKKRTRGLGATTKITTSHPRVPSliTTVPQTQGQTTA 
QKVS KKTGPR CS TAIATGLXNQKP VPAVP VQKSG TSGVP PMAGG 
KKPSKRPAWDLKGQLCDLNAELKRCRERTQTLDQENQQLQDQLR 
DACKJQVKALGTERTTLEGHIAKVQAQAEQGQQELKNLRACVLEL 
EBRLSTQEGLVQELQKKQVELQEERRGLMSQLEEKERRLQTSEA 
ALS S S QAEVAS LRQETVAQAALLTERE ERLHGLEMERRRLHNQL 
QELKGNIRVFCRVRPVLPGEPTPPPGLLLFPSGPGGPSDPPTRL 
SLSRSDERRGTLSGAPAPPTRHDFSFDRVFPPGSGQDEVFEBIA 
MLVQSALDGYPVCIFAYGQTGSGKTFTMEGGPGGDPQLEGL I PR 
ALRHLFSVAQELSGQGWTYS FVASYVEIYNETVRDLLATGTRKG 
QGGECEI RRAGPGSEELTVTNARYVPVSCEKEVDALLHLARQNR 
AVARTAQNERS SRSHS VFQLQI SGEHS SRGLQCGAPLSLVDLAG 
S ER LD PG LALG PG ERERLRETQ AINSSLSTLGL VI MALSNKE S H 
VPYRNS KLTYLLQNSLGGSAKMLMFVN I SPLEENVS ESLNS LRF 
ASKVEPSVLFGTAQSNRKVJKTDPDIjCVCVCVCVCVCVCVCVCVP 
MSMYRVRGGRVAGGCFIG WRAPCPRAI X 


5457 


2 


1540 


DDFVERRRWTRTTCLVRSPPHVPVCGHACSWNGGSLDPLKGTPA 
LLRSAERLMRKVKKLRLDKENTGSWRSFSIiNSEGAERMATTGTP 
TADRGDAAATDDPAARFQVQKHSWDGLRS I IHGSRXYSGLI VNK 
APHD FQ F VQKTDE SG PHS HRLY YLGMP YGSRENS LL YS E I PKKV 
RKEALLLLS WKQMIiDHFQATPHHGVYS REEEIiLRER KRLG VFG I 
TSYDFHSESGLFLFQASNSLFHCRDGGKNGFMVSPGPGCVSPMK 
PLE I fCTQCSGPRMDPKICPADPAFFS FINNSDL WVANIETGE ER 
RLTFCHQGLSNVLDDPKSAGVATFVIQEEFDRFTGYWWCPTASW 
EGSEGL KTLR I LYE EVDES E V E V I H VP S P ALEERKTDS Y R Y PRT 
GS KN PK I ALKLAE FQTDSQGKI VSTQEKEL VQP FS SLFP KVE YI 
ARAG WTRDGKYAWAMF LDRPQQWLQLVLLPPAL FI PSTENEEQA 
AShCQS CPQECPAVCG VRGGHQRLDQCS 


5458 


6642 


4022 


FVPGLREPQWEPAQPSATMSAPSBEEEYARLVMEAQPEWLRAEV ' 
KR LS HELAE TTREK I Q AAEYG LA VLEEKHQLKLQ FEEL EVD YE A 
I RSEMEQLKEAFGQAHTNHKKVAADGESREES L I QES AS KEQ Y Y 
WKVLELQTELKQLRNVLTNTQSENERLASVAQBLKEINQNVEI 
QRGRLRDDIKEYKFREARLLQDYSELEEENISLQKQVSVLRQNQ 
VEFEGLKHEIKRLEEETBYLNSQLEDAIRLKEISERQLEEALET 
LKTEREQKNSLRKELSHYMSINDSFYTSHLHVSLDGLKFSDDAA 
EPNNDAEALVNG FEHGGLAKLPLDNKTSTP KKEGLAP PS PS LVS 
DLLSELNISEIQKLKQQLMQMEREKAGLLATLQDTQKQLEHTRG 
SLSEQQEKVTRLTENLSALRRLQASKERQTALDNEKDRDSHEDG 
DYYEVD I NGP E I LACK YKVAVAEAGELREQLKALRS THEARE AQ 
HAEEKGR YEAEGQALTEKVS LLEKASRQDR ELLARLEJCELKKVS 
D VAGETQGS L$ VAQDEL VTF S EELANL YHHVCMCNNET PNRVML 
DYYREGQGGAGRTSPGGRTSPEARGRRSPILLPKGLLAPEAGRA 
DGGTGDSSPSPGSSLPSPLSDPRREPKNIYNLIAIIRDQIKHLQ 
AAVDRTTELSRQRIASQELGPAVDKDKEALMEEILKLKSLLSTK 






WO 01/53312 



PCT/US00/34263 



reqittlrtvi,icankotajevai^lkskyenekamvtetmmklr" 

NELKALKEDAATFSSIJIAMPATRCDEYITQLDEMORQIiAAAEDE 
KKTLNSLLRMAIQQKLALTQRLELLELDHEQTRRGRAKAAPKTK 
PATPSVSHTCACASDRABGTGLANQVPCSEKHSIYCD 


5459 


316 




RGGHRXjSGMASNFNDIVKQGYVRIRSRRLGIYQRCWLVFKKASS 
KG PKRLE KFS DERAAYFRCYH K VTE LNNVKNVARLPKS TKKHA I 
GIYFNDDTSKTFACESDLEADEWCKVLQMECVGTRINDISLGEP 
DLLATG VERE QS ERFNVY LM PS PNLGC YMGECALQ ITYE YICLW 
DVQNPRVKLISWPLSALRRYGRDTTWPTFEAGRMCETGEGLFIF 
QTRDGEAI YQKVHSAALAIAEQHERLLQSVKNSMLQMKMSERAA 

SLSTMVPLPRSAYWQHITRQHSTGQLYRLQDVSSPIjKLHRTETF 
PAYRSEH 


5460 


45 


2097 


rpgcragelstgsrarervrnrvsapcgqdsrrcdpevlrgrsp 

GLGLAEMPSCX5ACTCGAAAVRLITSSLASAQRGISGGRIHMSVL 
GRLGTFETQI LQRAPLRS FTETPAY FASKDGI S KDGSGDGNKKS 
ASEGSSKKSGSGNSGKGGNQLRCPKCGDLCTHVETFVSSTRFVK 
CE KCHH FF WLS EADS KKS X I KE PE SAAB AVKLAFQQ KP PP pp K 
KI YNYLDK YWGQSFAXK VliS VA VYNH YKR I YNNIPANLR QQAB 
VBKQTSLTPREIiEIRRREDEYRFTKLLQI AGI S PHGNALGASMQ 
QQ VNQQI PQEKRGGE VLDSSHDD I KLE KSN I LLLGPTGSG KTLL 
AQTLAKCLDVPFAICDCTTLTQAG YVGEDI ES V IAKLLQDANYN 
VEKAQQGIVFLDEVDKIGSVPGIHQIiRDVGGEGVQQGLLKLLEG 
TI VNVPE KNS RKLRG ET VQVDTTN I LFVAS GAFNGLDR IIS RRK 
NEKYIXSFGTPSNLGKGRRAAAAADLANRSGESNTHQDIEEKDRL 
LRHVEARDLIEFGMIPEFVGRLPVWPLHSLDEKTLVQILTEPR 
NAVIPQYO^FSHDKCEliNVTEDALKAXARIALERKTGARGLRS 
IMEKLLLEPMFEVPNSDIVCVEVDI03WEGKKEPGYIRAPTKES 
SEEEYDSGVEEEGWPRQADAANS 


5461


1481 


160 


INPPPPPKSPCGRARKWRRRRRPGAPEAA^/MfiL^SGPGPERLFD 
SHRLPGDCFLLLVLLLYAPVGFCLLVLRLFLGIHVFXiVSCALPD 
SVLRRFWRTMCAVLGLVARQEDSGLRDHSVRVLISNHVTPFDH 
NI VNLLTTC S T PLLNSP P S F VCWS RG FMEMNGRGE LVES L KRFC 
ASTRLPPTPLLLFPEEEATNGREGLLRFSSWPFSIQDWQPZjTL 
QVQRPLVSVTVSDASWVSELLWSLFVPFTVYQVRWLRPVHRQLG 
EANEEFALRVQQLVAKELGQTGTRLTPADKAEHMKRQRHPRLRP 
QSAQSS FPPSPG.PS PDVQLATLAQRVKEVLPH VPLG VIQRDLAK 
TGCVDLTITNLLEGAVAFMPEDITKGTQSLPTASASKFPSSGPV 
TPQPTALTFAKS S WARQES LQER KQAL YE YARRRFTERRAQ EAD


5462




3353 


KIKERQMSANNSPPSAQKSVLPTAIPAVLPAASPCSSPKTGLSA 

RLSNGSFSAPSLTNSRGSVHTVSFLLQIGLTRESVTIEAQELSL 

SAVKDLVCSIVYQKFPECGFFGMYDKILLFRHDMNSENILQLIT 

SADE IHEGDLVEWLSALATVEDFQ IRPHTLYVHS YKAPTFCDY 

CGEKLmLVRQGLKCEGCGLNYHKRCAFKIPNNCSGVRKRRLSN 

VSLPGPGLSVPRPLQPEYVALPSEESHVHQEPSKRIPSWSGRPI 

WMEKMVMCRVKVPHTFAVHSYTRPTICQYCKRLLKGLFRQGMQC 

KDCKFNCHKRCASKVPRDCLGEVTFNGEPSSLGTDTDIPMDIDN 

NDINSDSSRGLDDTEEPSPPEDKMFFLDPSDLDVERDEEAVKTI 

SPSTSNWI PLMRVVQSIKHTKRKSSTMVKEGWMVHYTSRDNLRK 

RHYWRLDSKCLTLFQNESGSiCYYKEIPLSEILRISSPRDFTNIS 

QG SNPHCFE IITDTM V YF VGENNGDSSHNP VLAATG VGLD VAQS 

WEKAIRQALMPVTPQASVCTS PGQGKDHKDLSTS ISVSNCQIQE 

UVDISTVYQ IF ADBVLGSGQFGIVYGGKHRKTGRDVA IKVIDKM 

RFPTKQESQLRNEVAIXiQNLHHPGIVNLECMFETPERVFWMEK 

LHGDMLEMILSSEKSRLPERITKFMVTO I LVALRNT.H PKntvhp 

DLKPENVLLASAEPFPQVKLCDFGFARI IGEKSFRRSWGTPAY 

LAPEVLRS KG YNRS LDMWS VG VI t YVS LSGT PPFNEDEDINDQ I 

QNAAFMYPPNPWREISGEAIDLIJJNLLQVKMRKRYSVDKSLSHP 

WLQDYQTWLDLREPETRIGERYITHESDDARWEIHAYTHNLVYP 

KHFIMAPNPDDMEEDP


5463


237 


1012 



LLS VTMTTS RCSHLPE VLP DCTS S AAP WKTVEDCGS LVNGQ PQ" " 
YVMQVSAKDGQLLSTWRTIiATQSPFNDRPMCRICHEGSSQEDL 











LS PCECTG TLGTI HRSCLEH WLSS SNTS YCE LCHFR FAV ERKPR 
PL VEWLRN PGPQHE XRTLFGDMVCFLF I TPLATI SGWLCLRGAV 
DHLHFSSRLEAVGL i altvalft i ylfwtlvs FR YHCRLYNEWR 
RTNQRV I LL I P KS VNVP SNQPS LLGLH S VKRNS KET W 




5464


195


677


S PS MNPRKKVDLKIj 1 I VGAIG VGKTSLLHQYVHKTF YEE YQTTL 
GASILSKI I ILGDTTLKUJ1WDTGGQERVRSMVSTFYKGSDGCI 
LAFDVTDLESFEALDIWRGDVLAKIVPMEQSYPMVLLGNKIDLA 
DRKYQSILENHLTESIKLSPDQSRSRCC 


5465 


5278 


3348 


KGD P RE F 1 R VH REAL E CD YVS AHLHEW I DL I FG Y KQQG PAAVBA 
VNVFHHLFYEGQVDIYNINDPLKETATIGFIKNPGQIPKQLFKK 
PHPPKRVRSRLNGDNAGISVLPGSTSDKIFFHHLDNLRPSLTPV 
XELXEPVG QI VCTDXGI LAVE QNXVIj I P PT WNKTFAWG YADLS C 
RLGTYES DKAMTVYECLS EWG Q I LCA I C PNPXLVI TGGTSTWC 
VWEMGTSKEKAKTVTLKQALLGHTDTVTCATASLAYHI IVSGSR 
DRTCIIWDLNKLSFLTQLRGHRAPVSALCINELTGDIVSCAGTY 
IHVWS INGNPI VS VNTFTGRSQQI I CCCMS EMNEWDTQNVI VTG 
HSDGWRFWRMEFLQVPETPAPEPAEVLEMQEDCPEAQIGQEAQ 
DEDSSDSEADEQS I S QDPKDTPS QPS STSHR PRAAS CRATAAKC 
TDSGSDDSRRWSDQLSLDEKDGFIFVNYSEGQTRAHLQGPLSHP 
HPNPIEVRNYSRLKPGYRWERQLVFRSKLTMHTAFDRKDNAHPA 
E VTALG I S XDHS RI L VGDS RGR VFS WS VSDQPGRSAADHW VKDE 
GGDSCSGCS VRFSLTERRHHCRNCGQLFCQKCS RFQS E I KRLKI 
SSPVRVCQNCYYNLQHERGSEDGPRNC 


5466


j 


992 


HACAHASAHASGRLVRWWRKRRSWGIQTSPVLLASLGVGLVTL 
LGLAVGSYLVRRSRRPQVTLLDPNEKYLLRLLDKTTVSHNTKRF 
RFALPTAHHTLGLPVGKHIYLSTRIDGSLVIRPYTPVTSDEDQG 
YVDLVIKVYLKGVHPKFPEGGKMSQYLDSLKVGDWEFRGPSGL 
LTYTGKGHFNIQPNKKSPPEPRVAKKLGMIAGGTGITPMLQLIR 
AILKVPEDPTQCFLLFANQTEKDIILREDLEELQARYPNRFKLW 
FTLDHPPKDWAYSKGFVTADMIREHLPAPGDDVLVLLCGPPPMV 
QLACHPNLDKLGYSQKMRFTY 


5467 


2103 


4 


GEALRVGTRGCRRDLPDPQARIFIQKKDLEEDESVTAAHLKSRG 
RSPRKIDQFCNSSNMVHGSVTFRDVAIDFSQEEWECLQPDQRTL 
YRDVMLENYSHLISLAGSS ISXPDVI TLLEQBXEPWMWRKETS 
RRYPDLELKYGPEKVSPENDTSEVNLPKQVIKQISTTLGIEAFY 
FRNDSEYRQFEGLQGYQEGNINQKMISYEKLPTHTPHASLICNT 
HKP YE CKECGK YFSCGSNL I QHQS IHTGEXPYXCKECGKAFQLH 
IQLTRHQKFHTGEKTFECKECGKAFNLPTQLNRHKNIHTVKKLF 
ECKECGKSFNRSSNLTQHQSIHAGVKPYQCKECGKAFNRGSNLI 
QHQKXHSNEKPFVCKECGMAFRYHYQLIEHCQIHTGEKPFECKE 
CGKAFTLLTKLVRHQKIHTGEKPFECRECGKAFSLLNQLNRHKN 
IHTGEKPFECKECGKSFNRSSNLVQHQSIHAGIKPYECKECGKG 
FKRGAHLIQHQKIHSNEKPFVCRECEMAFRYHCQLIEHSRIHTO 
DKPFECQDCGKAFNRGSSLVQHQSIHTGEKPYECKECGKAFRLY 
LQLSQHQKTHTGEKPFECKECGKFFRRGSNLNQHRS IHTGKKP F 
ECKECGKAFRLHMHLIRHQKLHTGEKPFECKECGKAFRLHMQLI 
RHQKLHTGEKPFECKECGKVFSLPTQLNRHKNIHTGEKAS


5468


225 


2976 


SFLTDLFQSLAQLF^LCKQLYETTDTTTRLQAEKALVEFTNSPD ' 
CLSKCQLLLERGSSSYSQLLAATCLTKLVSRTWNPLPLEQRIDI * 
RNYVLN YLATR PXLATFVTQALIQLYAR I TKLGWFDCQKDD YVF 
RNA XTDVTRFLQDS VEYCI IGVTILSQLTNEINQVSATAFL I EA 
DTTHPLTKHRKIASSFRDSSLFDIFTLSCNLLKQASGKNLNLND 
ES QHGLLMQLLKLTHNCLNFDFIGTSTDESSDDLCTVQIPTS WR 

SASPMDIAVQEGRLTWLVYIIGAVIGGRVSFASTDEQDAMDGEL 
VCRVLQLMNLTDSRLAQAGNEKLELAMLSFFEQFRKIYIGDQVQ 
KSSKLYRRLSEVLGLNDETMVLSVFIGKIITNLKYWGRCEPITS 
KTLQLLNDLSIGYSSVRKLVKLSAVQFMLNNHTSEHFSFLGINN 
QSNLTDMRCRTTF YTALGRLLMVDLGEDEDQ YEQ FMLPLTAAFE 
AVAQMFSTNSFNBQEAKRTLVGLVRDLRGIAFAPNAKTSPMMLF 
EWIYPSYMPILQRAIELWYKDPACTTPVLKLMAELVHNRSQRLQ








FDVSSPNGlLu^ETSKMITMYGNRiLTI^EVPkDQVYALKLKG 
IS ICFSMLKAALSGS YVNFGVFRLYGDDALDNALQTFI KLLLS I 
PHSDLLD Y PKLSQS YYSLLEVLTQDHMN F IASLEPHVI M Y I LSS 
IS EGLTALDTM VCTG CCS CLDH I VTYLFKQLSRST KKR TT P LNQ 
ES DRFLH I MQQH PEM I QQMLST VLN III FEDCRNQWS MS R PLLG 
LILLNEKYFSDLRNSIVNSQPPEKQQAMHLCFENLWEGIERNLL 
TKNRDRFTQNLSAFRREVNDSMKNSTYGVNSNDMMS 


5469 


134 


2S53 


DQEFETSLVPWHLPMGWLCSGLLFPVSCLVLLQVASSGNMKVLQ 
EPTCVSDYMSISTCEWKMNGPTNCSTELRLLYQLVFLLSEAHTC 
VPENKGGAGCVCHLLMDDVVSADNYTLDLWAGQQLLWKGSFKPS 
EHVK PRAPGN1»TVHTNVSDTLLI*TWS NP YP PDN YLYNHLT YAVN 
IWS ENDPADFR I YNVT YLE PSIiR I AASTLKSG I S YRAR VR AWAQ 
CYN7TWSEWS PSTKWHNS Y REPFEQHLLLGVS VSCIVI LAVCLL 
CY VS ITK1KKEVJWDQ I PNPARS RL VAI I 1 QDAQG S QWE KRS RGQ 
EPAKCPHWKNCLTKI^PCFLEHNMKRDBDPHJCAAKEWPFQGSGK 
SAWCPVEISK7VLWPESISWRCVELPEAPVECE3EEEVEEEKG 
SFCASPESSRDDFQEGREGIVARLTESLFUDI»LGEENGGFCQQD 
W3ES(^LPPSGSTSAHMPWDEFPSAG?KEAPPWGXEQPLHLBPS 
PP AS PTQS PDNLTCTETPLVIAGN P AYRS FSNS LS QS PC PRELG 
PDPLLARHLEEVEPEMPCVPQLSEPTTVPQPEPETWEQILRRW 
IiQHGAAAAPVSAPTSGYQEFVHAVEQGGTQASAWGLGPPGEAG 
YKAFSSLLASSAVSPEKCGFGASSGEEGYKPFQDLIPGCPGDPA 
PVPVPLFTFGLDREPPRSPQSSHLPSSSPEHLGLEPGEKVEDMP 
KPPLPQEQATDPLVDSLGSGIVYSALTCHLCGIILKQCHGQEDGG 
QTPVMASPCCGCCCGDRASPPTTPLRAPDPSPGGVPLEASLCPA 
SLAPSGISEKSKSSSSFHPAPGNAQSSSQTPKIVNFVSVGPTYM 
RVS 


5470


17 


1418


TACRIRTSI^RGIAAVKKOAVEMLASYGLAYSLMKFFTGPMSDF ' ' 
KNVGLVFVNS KRDRTKAVLCMWAGAIAAVFHTL I AYSDLG Y Y I 
INKLRHVDESVGSKTRRAFLYLAAFPFMDAMAWTHAGILLKHKY 
SFLVGCAS ISDVIAQWFVAILLHSHLECREPIiLI P I LSLYMGA 
LVRCTTLCT/3 YYKNIHD 1 1 PDRSGPELGGDATI RKMLS FW WPLA 
LIIATQRISRP I VNLFVSRDLGGSS AATEAVAI LTATYPVGHM P 
YGWLTEIRAVYPAFDKNNPSNKLVSTSNTVTAAHIKKFTFVCT4A 
LSLTLCFVMFWTPNVSEKI LIDI IGVDFAFAELCWPLRI FSFF 
P VP VTVRAHLTGW LMTLKKT FVLAP S S VLRI I VLI ASLWL P YL 
G VHGATLGVGS LLAGFVGE STMDAI AACYVYRKQKKKMENE S AT 
EGEDSAMTDMPPTEEVTDIVEMREENE 


5471 


18^8 


658 


RSSAPPGPQRAAAATAAAAAAGVEMAAAAAQGGGGGEPRRTEGV ' 
GPG VPGE VEMVKGQ PFDVG PRYTQLQY I GEGAYGMVS S AYDHVR 
KTRVAI KKI S PFEHQTYCQ RTLRE IQ I L LRFRHENV I G I RD I LR 
ASTLEAMRDVY I VQDLMETDL YKLLKS QQLSNDH I C YFL YQ I LR 
GLKYIHSANVLHRDLKPSNLLINTTCDLKICDFGIARIADPEHD 
HTGFLTEYVATRWYRAPEiriLHSKGYTKSIDIWSVGCILAEMLS 
WRPIFPG KHYLDQLNHILGI LGSPSQBDLNCIINMKARNYLQSL 
PSFCTKVAWAKLFPKSDSKALDLLDRMLTFNPNKRITVEEALAHP 

YLEQYYDPTDEPVAEEPFTFAMELDDLPKERLKELIFQETARFQ 
PGVLEAP 


5472


1469 


753 


LYVMAR YLSDEE VA VS I DRLCKANGR S PS I P FGTVR I PGRAR VR 

DPQALWI FG YGSLVWRPDFAYSDSRVGFVRGYSRRFWQGDTFHR 

GSDKMPGRWTLLEDHEGCTWGVAYQVQGEQVSKALKYLNVREA 

VLGGYDTKEVTFYPQDAPDQPLKAIAYVATPQNPGYLGPAPEEA 

IATQILACRGFSGHNLEYLLRVRDVMQLCGPQAQDEHJUAAIVDA 
VOTMT.PPFOPTPnnT.AT \r 


5473


3 


2113 


FMNVKLLIQDLEDIEQRVPVMDAQYKIITKTAHLiTKESPQEEfi" 

KEMFATMSKLKEQLTKVKECYSPLLYESQQLLIPLBELEKQMTS 

FYDS LG K IN E 1 1 T VLERE AQSS ALFKQKHQELLACQ ENCKKTLT 

LIEKGSQSVQKFVTLSNVLKI1FDQTRLQRQIADIHVAFQSMVKK 

TGDWKKHVETNSRLMKKFEESRAELEKVLRIAQEGIiEEKGDPEE 

LLRRHTEFFSQLDQRVLNAFLKACDELTDILPEQEQQGLQEAVR 

KLH KQW KDLQGEAP YHLLHLK I D VE KNRFLAS AEECRTELDRE T








KLMPQEGSEKIIKEHRVPFSDKGPHHLCEKRLQLIEELCVKLPV 
RDPVRDTPGTCHVTLXELRAAIDSTYRKLMEDPDKWKDYTSRFS 
EFSSWISTNETQLKGIKGEAIDTANHGEVKRAVEEIRNGVTECRG 
ETLSWLKSRLKVLTEVSSENEAQKQGDELAKLSSSFKALVTLLS 
EVEKI4LSNFGDCVQYKEIVKNSLEELISGSKEVQEQAEKILDTE 
NL FEAQQLLLH HQQKTKR I SAKKRDVQQQ I AQAQQGEGGLPDRG 
HB EltR KLESTLDGLE RS RE RQERR I Q VTLRKWER FETNKETWR 
YLFQTGSSHERFLSFSSLESLSSELEQTKEFSKRTESIAVQAEN 
LVKEASEIPLGPQNKQLLQQQAKS I KEQ VKKLEDTLE BE YVIDK 
S 


5474 


2 


780 


TPDVRQLQASRRGIAVASWCSPRWFAGEEMAFVKSGWLLRQSt'i"' 
LKRWKKNW FDLWSDGHL I YYDDQTRQN I EDKVHM PMDC INI RTG 
QECRDTQPPDGKSKDCMLQIVCRDGKTlSLCAESTDDCtiAWKFT 
LQDSRTNTAY VGS AVMTDETS WS S PP P YTAYAAPAPEVGRTLS 
LQQAYGYGPYGGAYPPGTQWYAANGQAYAVPYQYPYAGLYGQQ 
PANQVI IRERYRDNDSDLALGMLAGAATGMALGSLFWVF 


5475 


2 


506 


ARGWLESLSLTCQTTPPPSSPCLLHSP2TFIHTMPPNLTGYYRF 
VSQKNMEDYLQALNISIAVRKIALLLKPDKEIEHQGNHMTVRTL 
STFRWYTVQFDVGVEFEEDLRSVDGRKCQTIVTWEEEHLVCVQK 
GEVPNRGWRHWLEGEMLYLELTARDAVCEQVPRKVR 


5476 


192 


1457 


SDSMSLU3CFCTSRTQVESLRPEKQSETSIHQYLVDEPTLSWSR 
PSTRAS EVLCSTNVSHYE LQVE IGRG FDNLTS VHLARHTP TGTL 
VTIKITNLENCNEERLKALQKAVILSHFFRHPNITTYWTVFTVG 
SWLWVISPFMAYGSASQLLRTYFPEGMSETLIRNIjFGAVRGLN 
YLHQNGCIHRSIKASHILISGDGLVTLSGLSHLHSIiVKHGQRHR 
AVYDFPQFSTSVQPWLSPELLRQDLHGYNVKSDIYSVGITACEL 
ASGQVPFQDMHRTQMLLQKLKGPPYSPLDISIFPQSESRMKNSQ 
SGVDSGIGESVLVSSGTHTVNSDRLHTPSSKTFSPAFFSLVQLC 
LQQDPE KRPSASSLLSHVFFKQMKEESQDSILSLLPPAYNKPS I 
SLPPVLPWTEPECDFPDEKDSYWEF 


5477


3 


1044 


RGNSRLRYSHEDELQLPRLPELFETGRQLLDEVEVATEPAGSRI 
VQEKVFKGLDLLEKAAEMLSQLDLFSRNEDLEEIASTDLKYLLV 
PAFQGALTM KQ VN PSKRLDHLQRAREH FIN YLTQ CHCYHVAE F3 
LPKTMNNSAENHTANSSMAYPSLVAMASQRQAKIQRYKQKKELE 
HRLSAMKSAVESGQADDERVREYYLLHLQRWIDISLEEIESIDQ 
EIKILRERDSSREASTSNSSRQERPPVKPFILTRNMAQAKVFGA 
GYPSLPTMTVSDWYEQHRKYGALPDQGIAKAAPEEFRKAACXJQE 
EQEEKEEEDDEQTLKRAREWDDWKDTHPRGYGNRQNMG 


5478 


2 


835 


KTVR I tfVPNVKG ES TVFRAHTATVRS VHFCS DGQ S FVTASDDKT 
VKVWATHRQKFLFSLSQHINWVRCAKFSPDGRLIVSASDDKTVK 
LWDKSSRECVHSYCEHGGFVTYVDFHPSGTCIAAAGMDNTVKVW 
DVRTHRLLQH YQLHSAAVNGLS FHP SGN YL I TASSDSTLKILDL 
MEGRLLYTLHGHQGPATTVAFSRTGEYFASGGSDEQVMVWKSNF 

DIGDHGEVTKVPRPPATLASSMGNLTVSILEQRLTLEEDKLKQC 
LENQQL I MQRATP 


5479 


2 


835 


KTVRIWVPWKGESTVFRAHTATVRSVK FCSDGQS FVTASDDKT ~ 
VKVWATHRQKFLFSLSQHINWVRCAKFSPDGRLIVSASDDKTVK 
LWDKSSRECVHSYCEHGGFVTYVDFHPSGTCIAAAGMDNTVKVW 
DVRTHRLLQHYQLHS AAVNGLS FHPSGNYL I TASSDSTLKILDL 
MEGRLLYTLHGHQGPATTVAFSRTGEYFASGGSDEQVMVWKSNF 
D I GDHGEVTKVPRPPATLASSMGNLTVS I LEQRLTLEEDKLKQC 
LENQQL I MQRATP 


5480 


444 


1952 


LSLTSRMEEAETiVKriWT .n&TTnirD irTrMrwT'bri'vnr' v-rc«f?i-»T^r rm — ■ 
li -"-^*^uv twji\ij\jt\x llJIV^ftiyfin^oUAiCLKIfchDKLKH 

QHLKKKALREXWLLDG IS SGKEQEEMKKQNQQDQHQ I QVLEQS I 

LRLEKE I QDLEKAELQIS TKEEA I LKKLKS IERTTEDI IRSVKV 

EREERAEES IEDIYANI PDLPKS YI PSRLRKE INEE KEDDEQNR 

KALYAME I KVE KDLKTGES TVLS S I PL PSDDF KGTG I KVYDDGQ 

KSVYAVSSNHSAAYNGTDGLAPVEVEELLRQASERNSKSPTEYH 

BPVYA2&PFYR PTTPQRET VTPGPNFQER IKI KTNGLGI G VNES I 

HNMGNGLS E ERGNN FNH I S PI P PVPHPRS V I QQAEE KLHTPQKR








LMTPWEESNVMQDKDAPSP"kPRtS PRBTI FGKSEHQNSSPTCQE 
DEEDVRYWIVHSLPPDINDTEPVTMIFMGYQQAEDSEEDKKFLT 
G YDG I IHAEL WI DDEE EEDEGEAEKPS YHP I APHSQ VYQPAKP 
TPLPRKRSEASPHEKHK3 


5481


3 


1422 


NS PGS VCLTOCVCPS LLHCLPP LLLLLLL P LLIjHES PQ P PALR V " 
VATSSDRNFMNKHQKPVIjTGQRFKTRKRDEKEKPEPTVPRDTLV 

qglneagddleavakfldstgsrldyrryadtlfdilvagsmla 
pggtriddgdktkmtnhcvfsanedhetirnyaqvfnklirryk 

YLEKAFEDEMKKLLLFLKAFSETEQTKLAMLSGIIiLGNGTLPAT 

iltslftdslvkegiaasfavklfkawmaekdansvtsslrkan 

LDXRLLELPPVNRQSVDHFAKYFTDAGLKEIjSDFLRVQQSLGTR 
KELQKEIiQERLSOECP I KEWLYVKEEMKRNDIjPETAVIGLLWT 

cimnavewnkkeelvaeqalkhlkqyapllavfssqgqselilx* 
qkvqeycydnihfmkafqkiwlfykadvlseeailkwykeahv 
akgksvfldqmkkfvewlqnaeeesesegeen 


5482




528 


THWMTGMCYAPHQVLSYINGVTTSKPGVSIiVYSWPSRNLSLRI, " 
EGLQEKDSGP YSCS VNVQDKQGKSRGHS IKTLELNVLVPPAPPS 

crlqgvphvganvtlscqsprskpavqyqwdrqlpsfqtffapa 
ldvl rgslshtnls s s magvyvckahnevgtaqcnvtlevstg p 
gaawagawgtlvglgllaglvu.yhrrgkaleepandikeda 

I APRTLPW PKS SDT I S KNGTLSS VTS ARALRP PHGP PR PGALT P 

TPSLSSQALPSPRLPTTDGAHPQPISPIPGGVSSSGLSRMGAVP 
VMVPAQSQAGSLV


5483


1 


788 


FFFFKGCRAGRGNESDYRKLEEMHQRFLVSERSKDDLQLRLTRA 
ENRIKQriETDSSEEISRYQEMIQKLQNVLBSERENCGLVSEQRIi 
KLQQENKQLRK3TESLRKIALE AQKKAKV K I S TMEHEFS I KERG 
FEVQLREMEDSNRNSIVEliRHLliATQQKAANRWKEETKKLTESA 
R I R I NN LKS E LS RQ KLHTQ EIXSQLEMANBKVAENE KL I LEHQ E 
KANRLQRRLSQAEERAASASQQLSVITVQRRKAASLMNLENI 


5484 


3 


1997 


IMADMEDLFGSDADSEAERKDSDSGSDSDSDQENAASGSNASGS 
ESDQDERGD3GQPSNKEI*FGDDSEDEGASHHSGSDNHSERSDNR 
SEASERSDHEDNDPSDVDQHSGSEAPNDDEDEGHRSDGGSHHSE 
AEG S E KAHS DDEKWGREDKSDQS DDEK IQNSDDEERAQGSDEDK 
LQNSDDDEKMQNTDDEERPQLSDDERQQLSEEEKANSDDERPVA 
SDNDDEKQNSDDEEQPQLSDBEKMQNSDDERPQASDEEHRHSDD 
EEKQDHKSESARGSDSEDEVLRMKRKNAIASDSEADSDTEVPKD 
NSGTMDLFGGADDISSGSDGEDKPPTPGQPVDENGLPQDQQEEE 
P1PETRIEVEIPKVNTDLGNDLYFVKLPNFLSVEPRPFDPQYYE 
DEFEDEEMLDEEGRTRLKLKVENTIRWRIRRDEEGNEIKESNAR 
IVKWSDGSMSLHLGNEVFDVYKAPLQGDHNHLFIRQGTGLQGQA 
VFKTKLTFRPHSTDSATHRKMTTiSLADRCSKTQKIRILPMAGRD 
PECQRTEM I KKEE ERLRAS I RRESQ QRRMR EKQHQRGLSAS YLE 
PDRYDEEEEGEESISLAAIKNRYKGGIREERARIYSSDSDEGSE 
EDKAQRLLKAKKLTSDE VR PNLFNSRGLS CTQE PTA LNEE LTDQ 
AGTN 


5485 


161 


1074 


KRK I IiS SMMDS EAHEKR P P I LTSS KQD I S PH ITNVGEMKH YLCG 
CCAAFNNVAITFPIQKVLFRQQLYGIKTRDAILQLRRDGFRNLY 
RGILPPLMQKTTTLALMFGLYEDLSCLLHKHVSAP E FATSGVAA 
VIAGTTEAIFTPLERVO^LLQDHKHHDKFTNTYQAFKALKCHGI 
GEYYRGLVPILFRNGLSNVLFFGLRGPIKEHLPTATTHSAHLVN 
DFICGGLLGAMLGFLFFPINWKTRIQSQIGGEFQSFPKVFQKI 
WLERDRKLINLFRGAHLNYHRSLISWGIINATYEFLLKVI 


5486 


1404 


142 


IPGSTISWSPAAARGLSVCRCCRr.HPASAMDTjjynnT.PirptrpG dd ~ 
PAAGKEAQKGPLLFDDLPPASSTDSGSGGPLLFDDLPPASSGDS 
GSLATSISQMVKTEGKGAKRKTSEEEKNGSEELVEKKVCKASSV 
IFGLKGYVAERKGEREEMQDAHVILNDITEECRPPSSLITRVSY 
FAVFDGHGGIRASKFAAQNLHQNL IRKFP KGDVI SVEKTVKRCL 
LDTFKHTDEEFLKQASSQKPAWKDGSTATCVIAVDNILYIANLG 
D8RAI LCRYNEBSQKHAALSLS KEHNPTQYEERMRIQKAGGNVR 
PGRVLGVLEVSRSIGDGQYKRCGVTSVPDIRRCQIiTPNDRFILL 











ACDGLFKVFTPEEAVNFILSCLEDEKIQTREGXSAADARYEAAC 
NRLANKAVQRGSADNVTVMWRIGH 


5487 


53S 


182 


AVSLEQIRGi^TPAPVPLPLQPCPSNCDMERVTLALLHAGLtA " 

LEANDPFANKDDPFYYDWKNLQLSGLICGGLLAIAGIAAVLSGK 

CKCKSSQKQHSPVPEKAIPLITPGSATTC 


5488 


1072 


2S9 


AMAASGEPQRQWQEEVAAVVWGSCMTDLVSLTSRLPKTGETIH 
GHKFFIGFGGKGANQCVQA^RLCIAMTSMVCKVGKDSFGNDYIEN 
LKQND I STE FT YQT KDAATGTAS 1 1 VNNEGQNI I VI VAGANLLL 
NTEDLRAAANV ISRAKVMVCQLE ITPATSLEALTMARRSGVKTL 
FNPAPAIADLDPQFYTLSDVFCCNESEAE I LTGLTVGSAADAGE 
AAL VLL KRG CQ WI ITLG AEGCW LSQTE P E P KH I PTEKVKAVD 
TTVSFKI 


5489


81 


893 


GKG P VAAFI DQSNI FLTDPX I FLGQWR E E P KM PLLLLGE TE^LK 
LERDCRSP VE PWAAAS PDLAtACLCHCQDLSSGAFPNRGVLGGV 
LFPTVEMVIKVFVATSSGSIAIRKKQQEWGFLEA>JKIDFKELD 
IAGDEDNRRWMRENVPGEKKPQNGI PLPPQI FNEEQYCGDFDSF 
FS AKE EN 1 1 YS FLGLAP P PD S KG SE KA3EGGE TEAQKEGS EDVG 
NLPEAQEKNEEEGETATEET3EIAMEGAEGEAEEEEETAEGEEP 
GEDEDS 


5490 


81 


893 


GKG PVAAF IDQSNI FLTDPKI FLGQWR EEPKMPLLLLGETEPLK 
LERDCR S P VE ? WAAAS PDLAliACLCHCQDLSSGAFPNRG VLGG V 
LFPTVEMVIKVFVATSSGS IAIRKKQQBWGFLEANKIDFKELD 
IAGDEDNRRWMRENVPGEKKPQNGI PLPPQI FNEEQYCGDFDSF 
FSAKEENIIYSFLGLAPPPDSKGSEKAEEGGETEAQKEGSEDVG 
NLPEAQEKNEEEGETATEETEEIAMEGAEGEAEEEEETAEGEEP 
GEDEDS 


5491 


204 


1194 


GSAPRLSLGPTGAQARDFDWWARPPSRPYTQSKEDRPDTEGRSE 
. QGDMAS S FLP AG A I TGDSGGE LSSGDDSGEVE F PHS PE IE ETS C 
LAELFEKAAAHLQGLIQVASREQLLYLYARYKQVKVGNCNTPKP 
S FFD FEG KQKWE A WKALGDS S P SQAMQE YI AWKKLD PGWNPQI 
PEKKGKBANTGFGGPVISSLYHEETIREEDKNI FDYCRENNI DH 
ITKAI KS KNVDVNVKDEEGRALLHWACDRGHKELVTVLLQHRAD 
INCQDNEGQTALH YAS ACE FLD I VELLLQSGADPTLRDQDGCLP 
EEVTGCKTVSLVLQRHTTGKA 


5492 


3 


1896 


ASKNPLS AVCTTG IMSSLAVRDPAMDRSLRS VFVgNI: P YEfc'rtfii 
QLKDI FSE VGSWS FRLVYDRETGKPKG YGFCEYQDQETALS AM 
RNLNGREFSGRALRVDNAASEKNKEELKSLGPAAP I IDSPYGDP 
IDPEDAPES I TRAVASLPPEQMFELMKQMKLCVQNSHQEARNML 
LQNPQLAYALLQAQWMRIMDPEIALKILHRKIHVTPLIPGKSQ 
SVSVSGPGPGPGPGLCPGPNVLLNQQNPPAPQPQKLARRPVKDI 
PPEjMQTPIQGGI PAPGP I P AAVPGAGPGS LTPGGAMQ P QLGMPG 
VGPVPLERGQVQMSDPRAPIPRGPVTPGGLPPRGLLGDAPNDPR 
GGTLLSVTGEVEPRGYLGPPHQGPPMHHASGHDTRGPSSHEMRG 
GPLGDPRLLIGEPRGPMIDQRGLPMDGRGGRDSRAMETRAMETE 
VLETRVMERRGMETCAMETRGMEARGMDARGLEMRGP VPS SRGP 
MTGGIQGPGP INIGAGGP PQGPRQVPGI SGVGNPGAGMQGTG I Q 
GTGMQGAG I QGGGMQGAG IQGVS I QGGG IQGGGIQGAS KQGGSQ 
P SS FSPGQSQVTPQ DQE KAAL IMQ VLQLTADQ IAMLP P EQRQS I 
LILKEQIQKSTGAS 


5493 


1 


1876 


RAPMMTKAVPE3PRKPGRLTQALNSPLTWEHVWICVPGGTPDCL 
TDTFRVKRPHLRRSASNGHVPGTP VYRE KEDMYDEI IELKKSLH 
VQKSDVDLMRTXLRRLEEENSRKDRQIEQLLDPSRGTDFVRTLA 
EKRPDASWVINGLKQRILKLEQQCKEKDGTISKLQTDMKTTNLE 
EMRIAMETYYEEVHRIiQTLIiASSETTGKKPLGEKKTGAKRQKKM 
GSALLSLSRS VQELTEENQSLKEDLDRVLSTS PTI S KTQGYVEW 
SKPRLLRRIVELEKKLSVMESSKSHAAEPVRSHPPACLASSSAL 
HRQPRGDRNKDHERLRGAVRDLKEERTALQEQLLQRDLEVKQLL 
QAKADLEKELECAREGEEERREREBVLREEIQTLTSKLQELQEM 
KKBEKEDCPEVPHKAQELPAPTPSSRHCEQDWPPDSSEEGLPRP 
RSPCSDGRRDAAARVLQAQWKWKHKJCKKAVLDEAAVVLQAAFR








OilLlKiKljliASKAHObKPt'SVPGLPiXJilSFvPRVPSPIAQATGS 
P VQE EAI V 1 1 QS ALRAHLARARH SAT5KRTTTAAS TRRRS ASAT 
HGDASSPPFLAALPDPSPSGPQAVAPLPGDDVNSDDSDDIVIAP 
SLPTKNFPV 


5494 


71


536 


RSKAKIGTPTREVPSTDMXVRRESSSSLTHRPAPSPATPRLLGT" 
RRVLLGVS EGTG CADAME LVL VFLCS LLAPMVLASAAEKEKEKD 
PFHYDYCJTLRIGGLVFAWliFSVGILLILSRRCKCSPNQKPRAP 
GDEEAQVENLITANATEPQKAEN 


5495 


273


1168


DS LLLI QVDTM PFTLHLRS RL PSA I R S L I LQ KKPN IR NT S SMAG 
ELRPASLWLPRS LAPAFERFCQVNTGPLPLLGQSE PEKWMLPP 
QGAISETRMGHPOFWKYEFGACTGSIiASLBQySEQLKDMVAFFL 
GCSFSLEEALEKAGLPRRJDPAGHSQAGAYKTTVPCVTHAGFCCP 
LWTMRP I PKDKLEGL VRACCSIX3GE QGQ PVHMGDPELLG I KEL 
SKPAYGDAMVCPPGEVP VFWPSPLTSLGAVSS CETPLAFAS I PG 
CTVMTDLKDAKAPPGCLTPER1PEVHHISQDPLHYSIASVSASQ 
KIRELESMIGIDPGNRGIGHLLCKDELLKASLSL3HARSVLITT 
GFPTHFNHEPPEETDGPPGAVALVAFLOALEKEVAIIVDQRAWN 
LHQKIVEDAVEQGVLKTQIPILTYQGGSVEAAQAFLCKNGDPQT 
PRFDHLVAIERAGRAADGNYYNARKMNIKHLVDP IDDLFLAAKK 
IPGISSTGVGDGGNELGMGKVKEAVRRHIRHGDVIACDVEADFA 
VIAGVSNWGGYALACALY1LYSCAVHSQYLRKAVGPSRAPGDQA 
WTOALPSVIKBEKMLGILVQHKVRSGVSGIVGMEVDGLPFHNTH 
AEMlQKIiVDVTTAQV 


5496
5497


3 


2408 


QDT KMHE I Y KGN I TPQLNKNTLKTS AATDVWAV Y FSQF W I D Y EG 

MKSGKGRPISPVDSFPLSIWICQPTRYAESQKEPQTCNQVSLNT 

SQSSSSDLAGRliKRKKLLKEYYSTESEPLTNGGQKPSSSDTFFR 

FSPSSSEADIHLLVHVHKHVSMQINHYQYLLLI1FI.HESLILLSE 

NLRKDVE AVTGS PASQTS I CI GI LLRSAELALLLH P VDQANTLK 

SPVSESVSPWPDYLPTENGDFIiSSKRXQISRDINRIRSVTVNH 

MSDKRSMSVDLS HI PLKDPLLFKSASDTNLQKGI S FMDYLS DKH 

LGKISEDESSGLVYKSGSGEIGSETSDKKDSFYTDSSSVLNYR3 

DSN I LS FDS DGNQNILSSTLTS KGNETI ESI FKAEDLLPEAASL 

SENLDISKEETPPVRTLKSQSSLSGKPKERCPPNLAPLCVSYKN 

MKRSSS QMS LDT I S LD SM I LE EQLLESDGS DSHMFLEKGNKKNS 

TTNYRGTAESVNAGANLQNYGETSPDAISTNSEGAQENHDDLMS 

VWFKITGVNGE IDIRGEDTE ICLGVNQVTPDQLGN I SLRHYLC 

NRPVGSDQKAVIHSK3SPEISLRFESGPGAVIHSLLAEKNGFLQ 

CHI KNFSTE FLTSS LMN IQHFLEDETVATVMPMKI QVS NTKINL 

KDDS P RS S T VSLEPAP VTVHIDHLWERS DDGS FH I RDS HMLNT 

GNDLKENVKSDSVLLTSGKYDLKKQRSVTQATQTSPGVPWPSQS 

ANFPEFSFDFTREQLMEENESLKQELAKAKMAIiAEAHLEKDALIj 

HHIKKMTVE 




1821 


3308 


SISKLLKRRSNIDAYLLSNSCAFFAPRLFSI.ASQIIREQQSPNV"" 
CFIYKYSGFPSLECQCHFVSPHSSCYINFFSFPPPFFVCFQLSN 
G FSHYS LS S ESHVGPTGAGLFPHCLPASRLLPRVTS VHL PD YAH 
Y YTI GPGM FPSSQ I PS W KDWAKPG PYDQPLVNTLQRRKE KRE PD 
PNGGGPTTASGPPAAABEAQRPRSMTVSAATRPGEEMEACEELA 
LALS RGLQLDTQRS S RDS kQCS SG YSTQTTTP CCS EDT I PSQVS 
DYDYFSVSGDQEADQQEFDKSSTIPRNSDISQSYRRMFQAKRPA 
S TAGLP TTLGPAMVTPGVATIRRTPSTKPS VRRGT IGAG P I P I K 
T P V I PVKTPT V PDL PG VLPA P PDGP EERGERS P E S PS VGEGPQG 
VTSM PS S MWSGQAS VNP PLPGPKPS I PE EHRQAI PES EAEDQER 

EPPSATVSPGQIPESDPADLSPRDTPQGEDMLNAIRRGVKLKKT 
TTNDRSAPRFS 


5498


' " $434 


1492 

1 


ILTHQEIFTGEKPCECGKASIQMSHLSQQKIYSGENPFACKVCG 
KVFSHKSNLTEHEHFHTREKPFECNECGKAFSQKQYVIKHQNTH 
TGEKLFECNECGKSFSQKENLLTHQKIHTGEKPFECKDCGKAFI 
□KSNLIRHQRTHTGEKPFVCKECGKT FSGKSNLTEHEKI HI GEK 
PFKCSECGTAFGQKKYLIKHQNIHTGEKPYECNECGKAFSQRTS 
LIVHVRIHSGDKPYECNVCGKAFSQSSSLTVHVRSHTGEKPYGC 
SIECX5KAFSQFSTLALHLRIHTGKKPYQCSECGKAFSQKSHHIRH
QK1HTH — — 


5499


324 


926 


GFGQIGRGHK1TTYPFS PRKSGRKGMAQSQGWVKRYI KAFCKGF 
FVAVPVAVTFIoDRVACVARVEGASMQPSXiNPGGSQSSDWLLNH 
WKVRNFEVHRGDIVSLVSPKNPEQKIIKRVIALEGDIVRTIGHK 
NRYVKYPRGH I WVEGDHHGHSFDSNSFGPVSLGLLHAHATHI LW 
PPERWQKLESVLPPERLPVQREEE 


5500 


1978 


1286 


KPDWRLQNLP PRLYLWRS S RFGFGHLKKRLQMDFKI EHTWDG FP 
VKHB P VFIRLNPGD RG VMMD I SAP F FRDP P AP LG E PGKP FNELW 
DYEWEAFFLND I TEQYLE VELCPHGQHLVLLLSGRRNVWKQEL 
P LS FR VS RGETKWEGKAYL PWS Y F P PNVTKFNS FAI HGS KDKRS 

YEAI/YPVPQHBLQQGQKPDFHCLEYFKSFNFNTLIiGEEWKQPSS 
DLWLIEKCDI


5501


2927 


2226 


CRP P VS ARVAPGHQGAVGGSGRRPARVE WDAAAR P SSRP FS LP 
AA1MLAL I SRLLD W FRSL FWKEEMELTLVGLC YSG KTTF VNVI A 
SGQFSEDMIPTVGFN^KVTKGNVTIKIWDIGGQPRFRSMWERY 
CRGVNAIVYMIDAADREK1EASRNELHNLLDKPQLQGIPVLVLG 
N KRDL PNALDEKQLI EKMNLSAIQDRE I CC YS I S CKEKDNI D I T 
LQWL I QHS KSRRS 


5502 


3 


824 


NSAFPVWVPERTALLTCPIK^PGSSREAPGIAGPP^STAMSW, 
GKFFKGGG5 S KSRAAPS PQEALVRLRETEEMLG KKQE YLE N R I Q 
REIAIAKKHGTQNKRAALQALKRKKRFEKQLTQIDGTLSTIEFQ 
REALENSHT^EVLRNMGFAAKAMKSVHENMDLNKIDDLMQEIT 
EQQDIAQEI3EAFSQRVGFGDDFDEDELMAELEELEQEELNKKM 
TNIRLPNVPSSSLPAQPNRKPGMSSTARRSRAASSQRAEEEDDD 
IKQLAAWAT 


5503 
5504


216


654 


KGVRRRGRVRSDSEDSHLGYFKMSFLLPKLTSKKfeVDQAlKSTA 
EKVLVLRFGRDEDP VCLQL DD I L S KTS S DLS KMAAI YL VDVDQT 
AVYTQYFDISYI PSTVFFFNGQHMKV0YGGEDPALRSIKAVRRT 
SPAGTLG8KPVKS 




58 


3563 


QLS FSFQAP VTFDD I TVYLLQE E WVLLSQQQ KELCGSNKLVAPL 

GPTVANPELFRKFGRGPEPWLGSVQGQRSLLEHHPGKKQMGYMG 

EMEVQGPTRESGQSLPPQKKAYLSHLSTGSGHIEGDWAGRNRKL 

LKPRS IQKS WFVQ FPWL IMNEEQTALFCSACREYPS IRDKRSRL 

I EG YTG P FKVETLKYMAKS KAHMFCVNAIiAARDP I WAARFRS 1 R 

DPPGDVLASPEPLFTADCPIFYPPGPLGGFDSMAELLPSSRAEL 

EDPGGDGA I PAM YLDCI SDLRQKE I TDOXHSSSDINILYblDAVE 

SCIQDPSAEGLSEEVPWFEELPWFEDVAVYFTREEWGMLDKR 

QKELYRDVMRMNY ELLAS LG PAAAKPDL I S KLERRAAP WI KD PN 

GPXWGKGRP PGNKKMVAVREADTQASAADS ALLFGS P VEARAS C 

CSSSICEEGDGPRRIKRTYRPRSrQRSWFGQFPWLVIDPKETKL 

FCSACIERPWLHDKSSRLVRGYTGPFKVETLKYHEVSKAHRLCV 

NTVE1KEDTPHTALVPEISSDLMANMEHFFNAAYSIAYHSRPLN 

DFEKILQLLQSTGTVILGKYRNRTACTQFIKYISETLKREILBD 

VRNS PCVS VLLDSS TDAS EQ ACVG I Y IR YFKQMEVKE S Y I TLAP 

LYSETADGYFETIVSALDELDIPFRKPGWWGLGTDGSAMLSCR 

GGLVEKFQE VI PQLLPVHCVAHRLHLAWDACGS IDLVKKCDRH 

IRTVFKFYQSS2«CRLNELQEGAAPLBQBIIRLKDLNAVRWVASR 

RRTIiHALLVSWPALARHLQRVAEAGGQIGHRAKGNlLKLMRGFHF 

VKFCHFLLDFLSIYRPLSEVCQKEIVLITEVNATLGRAYVALES 

LRHQAGPKEEEFNASFKDGRLHGICLDKLEVAEQRFQADRERTV 

LTGIEYLQQRFDADRPPQLKNMEVFDTMAWPSGIEIiASFGNDDI 

IiNLARYFECSLPTGYSEEALLEEWLGLKTIAQHLPFSMLCKNAL 

AQHCRFPLLS KLMAWVCVPI STSCCERGPJf &MMP TPTnco tvt 

SNEVLNMLMMTAVNGVAVTEYDPQPAIQHWYLTSSGRRFSHVYT 
CAQVPARSPASARLRKEEMGALYVEEPRTQKPPILPSREAAEVL 
KDCIMEPPERLLYPHTSQEAPGMS 


5505 


3312 


1219 


NCS P RS LSAAKI1S NRNNNKLPS NL PQLQNL I KRDPPAY I EE FLQ " 
QYNKYKSNVEIFKLQPNKPSKELAELVKFMAQISHCYPEYLSNF 
PQEVKDLLSCNHTVLDPDLRMTFCKALILLRNKNLINPSSLLEL 
F PEL FRCHDKLLR KTL YTH I VTD I KNINAKH KNN KVNWLQN FK








YTMLRDSNATAAKMSLDVMIELYRRNIWNDAKTVNVITTACPSK 
VTKXLVAALTPFIiGKDEDSKQDSDSESEDDGPTAKDLX,VQYATG 
KKSSKNKKKLEKAMKVLKKHRKKKKPEVFNFSAIKLIHDPQDFA 
EKLLKQLECCKERFEVKMMUWNHSRLVGIHELFLFNFYPFLQR 
FLQPHQREVTKILLFAAQASHHLVPPEI IQSLLMTVANNFVTDK 
NSGE VMTVG INAIKEITARCPLAMTEELLQDIoAQYKTHKDKIIVM 
MSARTLIHL FRTLNPQMLQKKFRGKPTEAS I EARVQE YGE LDAK 
DY I PG AE VLE VEKEENAENDEDG WES TSLS EBEDADGEW I DVQH 
G SDEEQQE I S KKLNS M PMEERKAKAAA I STSR VLTQ ED FQKIRM 
AQMRKEIiDAAPGKSQKRKYIEIDSDEEPRGELLSLRDIERLHKK 
PKSDKETRLATAMAGKTDRKEFVRKKTKTNPFSSSTNKEKKKQK 
NFMMMRYSQNVRSKNKRSFREKQLALRDALLKKKKRMK 


5506 


1


1531 


FRGDIiCGQRGGSAPGEGGSSAfopAPAfcPLPEREREREALCPGRS 
CSGGGGEETPGTTPVWS PLEGGGDEELR PNP YVRFPYRWWAVW 
LAAF PS LGAGGETP EAP PE S WTQLWF FR FWNAAGY AS FMVPGY 
LLVQYFRRKNYLETGRGLCFPLVKACVFGNEPKASDEVPLAPRT 
EAAETTPMWQALKLLFCATGIiQVSYLTWGVLQERVMTRSYGATA 
TSPGERFTDSQFLVLMNRVLALIVAGLSCVLCKQPRHGAPMYRY 
SFASLSNVLSSWCQYEALKFVS FPTQVLAKASKVI PVMLMGKLV 
SRRS YEHW EY LTATL I S IG VSM FLLS S G P E PRS S PATTLSGL IL 
LAGY I AFDSFTSN WQDALFA YKMS S VQMMFG VNFFS CLFTVGSL 
LEQGALLEGTRFMGRHSEFAAHALIiLSICSACGQLFIFYTIGQF 
GAAVFTI IMTLRQAFAI LLS CLLYGHTVTWGGLGVAWFAALL 
LRVYARGRLKQRGKKAVPVESPVQKV 


5507 


3704 


1271 


PRGTRRCRPAGRASRRARRRPPCPGPAAPGSLE IGGFGTAAGKK " 
VAVAD VQPGPMRFHQDQLQ VLLVFTKEDNQ CNGFCRACE K AG FK 
CTVTKEAQAVLACFLDKHHD III I D HRNPRQLD A2 ALCRS IRSS 
KJ.-SENTVIVGWRRVDREELSVMPFISAGFTRRYVENPN1MACY 
NEIiLQLEFGEVRSOLKLRACNSVFTALENSBDAISITSEDRFIQ 
YANPAFETTMGYQSGELIGKELGEVPINEKKADLLDTINSCIRI 
GKEWQGIYYAKKKNGDNIQQNVKI IPVIGQGGKIRHYVSI IRVC 
NGNNKAEKIS B CVQSDTHTDNQTGKHKDRRKGSLDVKAVASRAT 
EVSSQRRHSSMARIHSMT I EAPITKVIN1 INAAQESSPMP VTEA 
LDR VLE I LRTTEL YSPQFGAKDDDPHANDLVGGLMS DGLRRLSG 
NEYVLSTKNTQMVSSNriTPISLDDVPPRIARAMENEEYWDFDI 
FELEAATHNRPLI YLGLXMFARPGICEFLHCSES TLRS WLQI I E 
ANYHSSNP YHNSTHSADVLHATAYFLS KER I KETLDPIDE VAAL 
IAATIHDVDHPGRTNSFLCNAGSELAILYNDTAVLESHHAALAF 
QLTTGDDKCNIFKNMBRNDYRTLRQGI IDMVLATEMTKHFEHVN 
KPVKSINKPLATLEENGETDKNQEVINTMIiRTPENRTLIKRMIil 
KCADVSNPCRPLQYCIEWAARISEEYFSQTDEEKQQGLPWMPV 
FDRNTCS I P KSQI S FI DYF I TDMFDAWDAFVDLP D IjMQHLDNNF 
KYWKGLDEMKLRNLRPPPB. 


5508 


1151 




LSSVFSRRSASHFAVGCSMGPFLHYWYLSLDRI*FPASGLRGFPN ta 
VLKKVLVDOLVASPLLGVWYFLGLGCLEGQTVGESCQELREKFW 
EFYKADWCVWPAAQFVNFLFVPPQFRVTYINGLTLGWDTYLSYL 
KYRSPVPLTPPGCVALDTRAD 


5509 


1238 


619 


RKSRGCQNALSASGPAAAAAAIMVRKLKFHEQKLLkQVDFLNWE 
VTDHNLHELRVLRRYRLQRREDYTRYNQLSRAVRELARRLRDLP 
ERDQFRVRASAALIiDKLYALGLVPTRGSLELCDFVTASSFCRRR 
LPTVLLKLRMAQHLQAAVAFVEQGHVRVGPDWTDPAFLVTRSM 
EDFVTWVDSSKIKRHVLEYNEERDDFDLEA 


5510 


96 


1195 


PAGAHIiSSGSSEPLVEPGRGRVGARVKGERGLQASGSAPGRSKM" 
AEGERQPPPDSSEEAPPATQNFI I PKKEIHTVPDMGKWKRSQAY 
ADYIG F Z IjTLNEG VKG KKLTFE YRVSE AI E KLVALLNTLDRW I D 
ETPPVDQPSRFGNKAYRTWYAKLDEEAENLVATWPTHLAAAVP 
EVAVYLKESVGNSTRlDYGTGHEAAFAAFIjCCIiCKIGVLRVDDQ 
IAI^KVFNRYLEVMRKLQKTYRMEPAGSQGVWGLDDFQFIjPFI 
WGSS QL I DH P YLE PRHFVDE KAVNENHKD YMFLECI LF I TEM KT 

GPFAEHSNQLWNI s avpsws kvnqglirmykaeclekfpviqhf 

KFGSLLPIHPVTSG 
SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


5511


276 


1980 


KLSHVLNLPPENLITSISAVPISQK^EVADFQLSVDSLLEKDND 
HSRPDIQVQAKRLAEKLRCDTWSEISTGQRTVNFKINRELLTK 
TVLQQVIEDGSKYGLKSELFSGLPQKKIWEFSSPNVAKKFHVG 
HLRST 1 1 GNF IANLKE ALGHQ VI RINYLGDWGMQ FGliLGTGFQIj 
FGYBEKLQSNPLQHLFEVYVQVNKEAADD3CSVAKAAQEFFQRLE 
LGDVQAliS LWQKF RDLS I E EY I RVYKRLG VY FDE YSGES F YRE K 

SQEVhKLhESKGLLLKTIKGTAVVDISGmDPSSICrvmSDGT 
SIjYATRDLAAAIDRMDKYNFDTMIYVTDKGQKKHFQQVFQMLKI 
MG YDWAERCQ HVP FG WQGMKTRRGDVTFLEDVLNE I QLRMLQN 
MASIKTTKELKNPQETAERVGLAALIIQDFKGLLLSDYKFSWDR 
VFQSRGDTGVFLQYTHARLHSLEETFGCGYLNDFNTACLQEPQS 
VS I LQHI>IiRFDE VLYKSS QDFQPRHI VS YliLTLSHLAAVAHKTL 
QIKDSPPEVAGARriHLFKAVRSVLANGMKLLGITPVCRM 


5512 


120 


1015 


DPSIiLLTITVTGVTVLVLVLKSMNSRRREPITLOTDPEAKYPLPL 
IEKEKISHNTRRFRFGLPSPDHVLGLPVGNYVQLLAKIDNELW 
RAYTPVSSDDDRGFVDLIIKIYFKNVHPQYPEGGKMTQY1ENMK 
iGETIFFRGPRGRLFYUGPGNLGIRPDQTSEPKKTLADHliGMIA 
GGTGITPMLQLIRKITKDPSDRTRMSIjIFANQTEEDILVRKELE 
EIARTHPDQFDLWYTIjDRPPIGMKYSSGFVTADMIKEHLPPPAX 
ST L I LVCG PP P bIQTAAHPNLE KLGYTQDM I FTY 


5513 


2 


837 


ARWRIiPSDSPRIPPAGAETPGRGSCRNYLPSSSPPPPEPSSFPS 
PPTSRGGPGSRD1WSDSEEESQDRQLKIWLGDGASGKTSLTTC 
FAQETFGKQYKQTIGLDFFLRRITLPGNLNVTLQIWDIGGQTIG 
GKMLDKYIYGAQGVLLVYDITKYQSFENLEDWYTWKKVSEESE 
TQPXjVALVGNKIDliEHMRTIKPEKHljRFCQENGFSSHFVSAKTG 
DSVFLCFQKVAAEILGIKXjNKABIEOSQRWKADIVNYNQEPMS 
RTVNPPRSSMCAVQ 


5514 


1295 



449 


VNRPSWIMGNFRGHALPGTFFFllGLWWCTKSILKYICKKQKRT 
CYLGSKTLFYRLEILEGITIVGMALTGMAGEQFIPGGPHLMLYD 
YKQGHWNQLLGWHHFTMYFFFGLLGVADILCFTISSLPVSLTKL 
MLSNAIiFVEAFI FYNHTHGREMLD I FVHQI^VLVVFLTGJjVAFL 

EFLVRNNVLLELLRSSLILLIIQGSWFFQIGFVLYPPSGGPAWDLM
DHENILFLTICFCWHYAVTIVIVGMNYAFITWLVKSRLKRLCSS
EVGLLKNAEREQESEEEM


5515


1572 


260 


FWLVGRGDCDPLLSVCLrtMPLYEGLGSGGEkTAWIDLGEAF 

tkcgfagetgprciipsvikragmpkpvrwqyninteelysyl 
kef i h i lyfrhl lvnprdrrwt iesvlcpshfretltrvlfky 
fevpsvu^shij^ltlginsamvldcgyreslvlpiyegip 
^cwgalplggkalhkeletqlleqctvdtsvakeqslpsvmg 

5VPEGVLEDIKARTCFVSDLKRGLKI0AAKFNIDGJMNERPSPPP 
MVDYPIiDGEKlLHILGSIRDSWEILFEQDNEEQSVATLILDSL 
IQCP IDTRKQLAENLWIGGTS MLPGFLHRLLAE IRYLVE KPKY 
KKALGT KTFR 1 HT P PAKAN CVAWLGGA I FGALQDILGSRSVSKE 
Y YNQTGR I PDWCS LNNP PLEMM FDVGKTQP PIWKRAFS TEK 




3 


f 


NSRE PPQAGPGPS PRKS PTASS FLFP WR PIA33FVWGAQGAQES 
iKAMWRVPGTTRRPVTGESPGMHRPEAMIiLLLTLALLGGPTWAG 
KMYGPGGGKYFSTTEDYDHEtTGLRVSVGI^DVKSVQVKLGDSW 
DVKLGALGGNTQEVTLQPGEYITKVFVAFQAFLRGMVMYTSKDR 
YFYFGKLDGQISSAYPSQEGQVLVGI YGQYQLU3I KS IGFEWNY 
PLEEPTTEPPVNLTYSANSPVGR


5517 


246 


499 


SEIYVAMRTDSSKMTDVESGVANFASSARAGRRNALPDIQSSAA
TDGTSDLPLKLEALS VKEDAXEKDEKTTQDQLEK PQNE EK 


5518 


3 


1375 


FLKTVAQNYSSVTHLHSIGKSVKGRNLWVLWGRFPKEHRIGIP 
E FK.YVANMHGDETVGRELLLHLII) YLVTSDGKDPE I TNL INSTR 
IHIMPSMNPDGFSAVKKPDCYYSIGRENYNQYDLNRNFPDAFBY 
NNVSRQPET VAVWKWLKTETFVLSANLKGGAL VAS YPFDNGVQA 
TGALYSRSLTPDDDVFQYLAHTYASRNPNMKKGDECKNKMNFPN 
GVTNGYSWYPLQGGMQDYNYIWAQCFEITLELSCCKYPREEKLP 
S FWNNNKASL I BY I KQ VH LG VKGQVFDQNGN PLPNV I VE VQDRK 
HICPYRTNKYGEYYLLLLPGSYI INVTVPGHDPHITK.VI IPEKS
QN FS ALKKD I LLP FQGQLDS I P VSNPSC PM I PL YRNLP DHS AAT 
KPSLFLFLVSLLHIFFK 


5519 


87 


477 


I KS KLNQQVE VQESEWRLTEAKG PTMGKBSGWDSGRAAVAAWG 
GWAVGTVLVALSAMGFTSVGIAASSIAAKMMSTAAIANGGGVA 
AGS LVA I LQS VGAAGLS VTSKVI GGFAGTALGAWLGS PPSS 


5520 


117 


943


PTEGRQKVLKTFTVPRSAIjAMTKTSTCIYHFLVLSWYTFLNYYI 
SQEGKDE VKPKILANGARWKYMTLLNLLLQTI F YGVTCLDDVLK 
RTKGGKDIKFLTAFRDLLFTTLAFPVSTFVFLAFMILFLYNRDL 
IYPKVLDTVIPVWLNHAMHTFIFPITLABVVLRPHSYPSKKTGL 
TLLAAASIAYISRILWLYFETGTWVYPVFAKLSLLGtAAFFSLS 
YVF IAS I Y LLGE KLNHW KW VSVQ I LQRWRLES VGI C FQ W PDWKS 
PAKHQLVKNIR 


5521


545


911 


K I LNMQ KS CEEN EG K PQNM PKAEE DR PLE DVPQ EAEGN PQ PSEE 
GVSQEAEGWPRGGPNQPGQGFKEDTPVRHLDPEEMIRGVDELER 
LREE IRRVRNKFVMMHWKQRHSRSRP YPVCFRP 


5522


1224


637


GSRPLGQRSREKMWVFGYGSLIWKVDFPYQDKLVGYITNYSRRF 
WQG STDHRG V PGKPGR WTLVE DPAG CVWG VAY RLPVG KE EE VK 
AYLDFREKGGYRTTTVIFYPKDPTTKPFSVLLYIGTCDNPDYLG 
PAPLEDIAEQIFNAAGPSGRNTEYLFELANSIRNLVPEEADEHL 
FALE KLVKERLEGKQNLNC I 


5523 


3 


1280 


SKGKKRMGS SMSAATARR PVFDDKEDVNFDHFQ I LRAIGKGS FG
KVCIVQKRDTEKMYAMKY1-4NKQQCIERDEVRNVFRELEILQEIE 
HVFLVNLWYSFQDEEDMFMWDLLLGGDLRYHLQQNVQFSEDTV 
RLYI CEMAJbALD YLRGQHI I HRDVKPDNILLDERGHAHLTDFNI 
ATI I KDGERATALSGTKPYMAPEI FHSFVNGGTG YSFE VTWWSV 
GVMAYELLRGWRPYDIHSSNAVBS LVQLFSTVS VQ YVPTWS KEM 
VALLR KLLTVNPEHRL S S LQ0VQAAPALAG VI» WDKLSE KR VE PG 
FVPNKGRLHCDPTFELEEMILESRPLHKKKKRLAKNKSRDNSRD 
S SQSENDYLQDCLDAI QQDFVI FNREKLKRSQDLPREPLPAPES 
RDAAEPVEDEAERSALPMCGPICPSAGSG 


5524


85 


2318 


RERERDHR PG ES S QGQS GAGG CF PS PTMELRCGGLLF5SRFDSG 
KLAHVEKVESLSSDGEGVGGGASALTSGIASSPDYEFNVWTRPD 
CABTE FENGNRS W FYFS VRGGMPGKLI KINIMNMNKQSKLYSQG 
MAPFVRTLPTRPRWERIRDRPTFEMTETQFVLSFVHRFVEGRGA 
TTFFAFCYPFSYSDCQELLNQLDQRFPENHPTHSSPLDTIYYHR 
ELLCYSLDGI.RVDLLTITSCHGLREDREPRLEQLFPDTSTPRPF 
RFAGKRIFFLSSRVHPGETPSSFVFNGFLDFILRPDDPRAQTLR 
RXFVFKLIPMLNPDGWRGHYRTDSRGVNLNRQYLKPDAVLHPA 
IYGAXAVLLYHHVHSRLNSQSSSEHQPSSCLPPDAPVSDIiEKAN 
NLQNEAQCGHSADRHNAEA WKQTEPAE Q KLNS VW I K PQQSAGLE 
ESAPDTIPPKESGVAYYVDLHGHASKRGCFMYGNSFSDESTQVE 
NMLYPKLISLNSAHFDFQGCaaFSEKNMYARDRRDGQSKEGSGRV 
AIYKASGI IHS Y TLE CN YNTGRS VNS I PAACHDNGRAS PPPPPA 
FPS R YT VEL FEQVGRAMA I AALDMAE CNPW PRI VLS EHSSLTNL 
RAWMLKHVRNSRGLSSTLNVG VNKKRGLRTPPKSHNGLP VS CS E 
NTLSRARS FSTGTS AGGS S S S QQNS PQM KNS PS FPFHGSR P AGL 
PGLGS SrQPCVTHRVLGPVRGKPVWEPLQHVFGCLGHCWGK 


5525 


105 


834 


SNTLDFERHLFIMGQQISDQTQLVINKLPEKVAKHVTLVRESGS
LTYEEFLGRVAELNDVTAKVASGQEKHLLFEVQPGSDSSAFWKV
WRWCTKINKSSGIVEASRIMNLYQFIQLYKDITSQAAGVLAQ
SSTSEEPDENSSSVTSCQASLWMGRVKQLTDEEECCICMDGRAD

LILPCAHSFCOKCIDKWSDRHRNr'PT C P I,r>MTn»NF QWWCna i> 
TEDDMAN Y I LNMADEAGQPHRP 


5526 


3 


853 


RRPCNPVRAAKRTGAAARA PRGLEVTMI/RVAWRTLS LI RTRAVT 
QVLVPGLPGGGSAKFPFNQWGLQPRSLLLQAARGYWRKPAQSR 
LDDDPPPSTLLKDYQNVPGIEKVDDVVKRLLSLEMAMKKEMLKI 
KQEQFMKKIVANPEDTRSLEARIIALSVKIRSYEBHLEKHRKDK 
AHKRYLLMS IDQRKKMLKNLRNTNYDVFEKICWGLGIE YTFP PL 
YYRRAHRRFVTKKALCIRVFQETQKLKKRRRALKAAAAAQKQAK
RRNPDSPAKAIPKTLKDSQ 


5527 


3225 


565 


LbRKyLLHQNPLLLRHQPNRTCISFSATMKLXDTKiSRPKQSSCG

KFQTKGIKWGKWKEVKXDPNMPADGCMDDLVCFEELTDYQLVS 

PAKNPSSLFSKEAPKRKAQAVSEEEEEBEGKSSSPKKKIKliKKS 

KNVATEGTSTQKEPEVKDPELEAQGDDMVCDDPEAOEMTSENLV 

OTAP KKKKNKGKKGLEPSQSTAAKVPKKAKTW I PEVHDQKADVS 

AWKDIiFVPRPVLRALSPLGFSAPTPIQALTLAPAIRDKLDlLGA 

AETGSGKTLAFAIPMIHAVLQWQKRNAAPPPSNTEAPPGETRTE 

AGAKTRS PG KAEAESDAL PEDTV I ES EALPSDI AAEARAKTGGT 

VSDQALLFGDDDAGBGPSSLIREKPVPKQNENEEENLDKEQTGN 

LKQELDD KS ATCKA YP KR PL LGLVLTPTRE IAVQVKQH IDAVAR. 

FTGI KTAILVGGMSTQKQQRMLNRRPE I WATPGRLWELI KEKH 

YHLRNLRQLRCLWDEADRMVEKGHFAELSQijLEMLNDSQVNPK 

RQTLVFSATLTLVHQAPARILHKKHTKKMDKTAKLDLLMQKIGM 

RGKPKVI DLTRNEATVE TLTETK I HCETDE KDFYLY YFLMQYPG 

RSLVFANSISCIKRLSGLLKVLDIMPLTLHACMHQKQRLRNLBQ 

FARLEDCVLLATDVAARGLDI PKVQHVI H YQVPRTS E I YVHRSG 

RTARATNEGLSLML1GPEDVINFKKIYKTLKKDEDIPLFPVQTK 

YMDWKER I RLARQ I E KS E YRNFQACLHNS W I EQAAAALE I ELE 

EDMYKGGKADQQEERRRQKQMKVLKKELRHLIjSQPLFTESQKTK 

YPTQSGKPPLLVSAPSKSESAIiSCLSKQKKKKTKKPKEPQPEQP 

QPSTSAN 


5528 


3 


895 


GPFLSACRMWGACKVKVHDSLAT I S I TLRR YLRLGATMA KS KFE

YVRDPEADDTCLAHCWVVVRLDGRNFHRFAEKHNFAKPNDSRAL. 

QLM?KCAQTVMEELED1VIAYGQSDEYSFVFKRKTNWFKRRASK 

FMTHVASQFASSYVFYWRDYFEDQPLLYPPGFDGRVWYPSNCT 

LKDYXSWRQADCHINNLYNTVFWALIQQSGLTPVQAQGRIiQGTL 

AADKNE ILFSEFNINYNNE PPMYRKGTVLI WQKVDEVMTKEIKI* 

PTEMEGKKMAVTRTRTKPC KPSHLPRAPCLRW L 


5529


48 


640 


TFRL VS AHLKTRXL INP EAAEK RWRD WDSRQG WLS VKMQR VSGL 
LSWTLSRVLWLSGLSEPGAARQPRIMEEKALEVYDLIRTIRDPE 
KPNTLEELEWSESCVEVQEINEEEYLVI1RFTPTVPHCSLATL 
IGLCLRVKLQRCLPFKHKLEIYISEGTHSTEEDINKQINDKERV 
AAAMENPNLREIVEQCVLEPD 


5530 


4541 


2606 


AQIVHAISYCHKLHVGHRDIiKPENWFFEKQGLVKLTDFGFSNK 
FQPGKKLTTS CGS LAYSAPEI LLGDEYDAPAVDI WSLGVI LFML 
VCG QPPFQEANDS EXLTMIMDCX YTV PSHVSKECKDL I TRMLQR 
DPKRRASLEE I ENHPWLQGVDPSPATKYNIPLVS YKNLSEEEHN 
S II QRMVLGDIADRDAI VEALETNRYNHITATYFLLABRI LREK 
QEXEIQTRSAS PSNI KAQFRQS WPTK IDVPQDLEDDLTATPLS H 
ATVPQS PARAADS VLWGHRSKGLCDSAKKDDLPELAGPALSTVP 
PASLKPTASGRKCLFRVEEDEEEDEEDKKPMSLSTQWJuRRKPS 
VTNRLTSRKSAPVIiNQ IFEEGESDDEFDMDENLPPKLSRLKMNI 
AS PGTVHKRYHRRKSQGRGS S CS S S ETS DDDS E SRRRLDKDS GF 
TYSWHRRDSSBGPPGSEGDGGGQSKPSNASGGVDKASPSENNAG 
VZt&iabijAjN PTNT5GT I i^CLAGPSNSMQLASRSAGEIiVESIjEC 
LMS LCLGSQLHGS TKYI I DPQNGLS FSS VKVQEKS TWKMC I S ST 
GNAGQVPAVGGIKFFSDHMADTTTELERIKSKNLKNNVLQLPXiC 
EKTISVNIQRNPKEGLLCASSPASCCHVI 


5531 


24 


515 


GSQPRAPRPRDSMERPEPELXRQSWRAVSRSPLEHGrVLFARLF
ALEPDHiPLFQYNCRQFSSPEnci>SSPEFLDHIRKVMLVIDAAV 
TNVEDLSSLEEYLASLGRKHRAVGVKLSSFSTVGESLLYMLEKC 
LGPAFTPATRAAWSQLYGAWQAMSRGWDGE


5532


3395 


1402 


SDWKWGKRKMIIEDETEFCGEBLLHSVLQCKSVFDVLDGEEMR

RARTRANPYEKIRGVFFLNRAAMKMANMDFVFDRMFTNPRDSYG 

KPL VKDREAELL Y FADVCAG PGGFS E YVLWRKKWHA KG FGMTIjK 

GPNDFKL.EDFYSASSBLFEPYYGEGGIDGDGDITRPENISAFRN 

FVUXNTDRKGVKFU4AIX3GFSVEGQENLQEILSKQLLLCQFLMA 

LSIVRTGGHFlCKTFDIiFTPFSVGIjVYLLYCCFERVCLFKPITS 

R P ANS ERYWCKGLKVG I DD VRD YLFAVN I KJLNQLRNTDSDVNL 
WPLEVIKGDHEFTDYMIRSNESHCSLQIKALAKrHAPVQDXTl7 
SBPRQAEIRKECLRLWG IPDQARVAPSSSDP KSKPPELIQGTE I 
DIFSYKPTLLTSKTLEKIRPVFDYRCMVSGSEQKFLIGLGKSQI 
YTWDGRQSDRWIKUDLKTELPRDTLLSVEIVHELKGEGKAQRKI 
SAIHILDVLVLNGTDVREQHFNQRIQIiAEKFVKAVSKPSRPDMN 
PIRVKBVYRLEEMEKIFVRLEMKIIKGSSGTPKLSYTGRDDRHF 
VPMGLYIVRTVNEPWTMGFSKSFKKKFFYNKKTKDSTFDLPADS 
IAP FHI CYYGRLFWEWGDG IRVHDSQKPQDQDKLSKEDVLS FIQ 
MHRA 


5533 


94 


789 


MKERRAPQPWARCKLVLVODVQCGKTAMLQVLAKDCYPETYVP 
XVFEN YTACLETE EQRVE LS LWDTSGS PYYDNVRPLCYSDS DAV 
LLCFD I SRPBTVDSALKKWRTE ILDYCPSTRVLL IGCKTDLRTD 
LSTLMEL SHQKQAPI S YEQG CAI AKQLG PE I YLEG S AFTS EKS I 
HSIFRTASMLCLNKPSPLPQKSPVRSLSKRLLHLPSRSELISPT 
FKKEKAKXCSIM


5534


3 


605 


LVRGRARAANPGRVGAMDGLRQRVEHFLEQRNLVTEVLGALEAK 
TGVEKRYLAAGAVTLLS L YLLPG YGAS LLCN LIGFVYP AYAS IK 
MESPSKDDDTVVnJTYWVVYALFGLAEFFSDLLLSWFPFYYVGK 
CAFLLFCMAPRPWNGALMLYQRWRPLFLRHHGAVDRIMNDLSG 
RALDAAAGITRNVKPSQTPOPKDK 


5535 


1029 


332 


kSFMDSEARLCSLVELSDTQDETQKSDSENEDLKIDCLQESQEL 
NLQKLKNSBRILTEAKQKMRELTVNIKMKEDLIKELIKTGNDAK 
SVSKQYTIjKVTKLEHDAEQAKVELTETQKQLQELENKDLSDVAM 

kvklqkefrxkvdaaklrvqvrokkqqdskklas ls ic3nekran 
eleqsvdhmkyqkiqlqrklqeenekrkqldavikrdqqkikvi 
lsyi pakynmkc 


5536 


942 


282 


AAATAASLSPRGCRLRTPSSDVSPSRA?PPSAAPLPTGRAQMSP 
SGRLCLLTIVGLILPTRGQTLKDTTSSSSADATIMDIQVPTRAP 
DAVYTELQPTSPTPTWPADETPQPQTQTQQLBGTDGPIjVTDPET 
HKSTKAAHPTDDTTTLSER PS PSTDVQTDPQTLKPSGPHBDDPF 
FYDEHTLRKRGLIiVAAVLFITGIIILTSGKCRQLSRLCRNHCR 


5537


3 



2391 


RARVSSPQLRVFRSGRPRRLRVIiRINRTSVALRLAGTGRFVAXT 
PGHPGSWEMGLLTFRDVAVEFSLEEWEHLEPAQKNLYQDVMLEN 
YRNLVSLGLWSKPDLITFLEQRKEPWNVKSEETVAIQPDVFSH 
YNKDLL TEHCTB AS FQKVI S RRHGS CDLENLHLR KRW KREECEG 
HNGCYDEKTFKYDQFDESSVESLFHQQILSSCAKSYNFDQYRKV 
FTHSSLLNQQEE I D I WGXHH I Y DKTSVLFRQVS TLNS YRNVFI G 
EKNYHCNNS EKTLNQSSS PKNHQENYFIjEKQYKCKE F3EVFLQS 
MHGQEKQEQSYKCNKCVEVCTQSLKHIOHQTIHIRENSYSYNKY 
DKDLSQSSNLRKQI I HNEEKP Y KCEKCGDSLNHSLKLTQHQ 1 1 P 
TEEKPYKWKECGKVFjn^CSLYLTKQQQIiyrGENLYKCKACSKS 
FTRSSNL I VHQRIHTGEKPYKCKECGKAFRCSS YLTKHKRI H?G 
E KPYKC KECGKAFNRS S CLTQHQTTHTGE KL YKCKVCS KS YARS 
SNLIMHQRVliTGEKPYKCKECGKVFSRSSCLTQHRXIHTGENLY 
KCKVCAKPFTCFSNLIVHERIHTGEKPYKCKECGKAFPYSSHLI 
RHHRIHTGEKPYKCKACSKSFSDSSGLTVHRRTHTGEKPYTCKE 
CGKAFSYSSDVIQHRRIHTGQRPYKCEECX5KAFNYRSYLTTHQR 
SHTGERPYKCEECGKAFNSRSYLTTHRRRHTGERPYKCDECGKA 
FSYRSYLTTHRRSHSGERPYKCEECXJKAFNSRSYLIAHQRSHTR 
EKL 


5538 


926 


161 


HSMMMKIPWGSIPVLMLLLLLGLIDISQAQLSCTGPPAIPGIPG 
IPGTPGPDGQPGTPGIKGEKGLPGLAGDHGEFGEKGDPGIPGNP 
uKvurivuri t\jtr ruo t^j/ifijM fOi* MjUcjOU i KA I\J ivl A r o ATRTI 
NVPLRRDQTIRFl)HVlTNMNNNYEPRSGKFTCfCVPGLYYFTYHA 
SSRGNLCVNLMRGRERAQKWTPCDYAYNTFQVTTGGMVLKLEQ 
GENVFLQATDKNSLLGMEGANS I FSGFLLFPDMEA 


5539


38 


1258 


HRGPSGAAAPGCALPRGQALEGPRSCRRPQPMARRYDELPHYPG
IVDGPAALASFPETVPAVPGPYGPHRPPQPLPPGLDSDGLKREK 
DEIYGHPLFPLLALVFEKCELATCSPRDGAGAGLGTPPGGDVCS 
SDSFNEDIAAFAKQVRSERPLFSSNPELDNLVIQAIQVLRFHLL 
ELB KVHDLCDNFCHR Y I T CLKGKMP I DL VI EDRDGGCREDFED Y 
PASCPSLPDQNNMWIRDHEDSGSVHLGTPGPSSGGLASQSGDNS 
SDQGDG LDTS VAS P SSGGEDE DLDQERRRN KKRG I FPKVATN I M 
RAWLFQHLSHPYPSEEQKKQLAQDTGLTILQVNNWFINARRRIV 
QPMIDQSNRTGQGAAFS PEGQPIGGYTETQPHVAVRPPGS VGMS 
LNLEGEWHYL 


5540 


148 


1440 


PPLGAGAGVHARSPHPARRLPLTTAGVGGRAPDLLPTPWRQHRG 
PSGAAAPGCALPRGQALEGPRSCRRPQPMARRYDELPHYPGIVD 
GPAALASFPETVPAVPGPYGPHRPPQPLPPGLDSDGLKRBKDEI 
YGH PLFP LliALVFE KCELATCS PRDGAGAGLGTPPGGDVCSS DS 
FNEDNTAFAKQVRS3RPLFSSNPELDNLMIQAIQVLRFHLLELE 
KjGXMPIDLVIEDRDGGCREDFEDYPA5CPSLPDQNNrWIRDHED 
SGSVHLGTPGPSSGGLASQSGDNSSDQGVGLDTSYASPSSGGED 
EDLDQE PRRNKKRG I F PKVATN I MRAWLFQHLSH P Y PSEEQKKQ 
LAQDTGLTILQVNNWFINARRRIVQPMIDQSNRTGQGAAFSPEG 
QPIGGYTETEPHVAFRAPASVGDEFGTRKEEWHYL 


5541 


143 


1440 


PPLGAGAGVHARSPHPARRLPLTTAGVGGRAPDLLPTPWRQHRG

PSGAAAPGCALPRGQALEGPRSCRRPQPMARRYDELPKYPGIVD 

GPAAliASFPETVPAVPGPYGPHRPPQPLPPGLDSDGLKREKDEI 

YGHPLFPLLALVFE KCELATCS PRDGAGAGLGTPPGGDVCSS DS 

FNEDNTAFAKQVRSERPLFSSNPELDNLMIOAIQVLRFHLLELE 

KGKMPIDLVIEDRDGGCREDFEDYPASCPSLPDQNNIWIRDHED 

SGSVHLGTPGPSSGGLASQSGDNSSDQGVGLDTSVASPSSGGED 

EDLDQEPRRNKKRGIFPKVATNIMRAWLFQHLSHPYPSEEQKKQ 

LAQDTGLTILQVNNWFINAKRRIVQPMIDQSNRTGQGAAFSPEG 

QPIGGYTET3PHVAFRAPASVGDEFGTRKEEWHYL 


5542 


146 


1440 


PPLGAGAGVHARSPHPARRLPLTTAGVGGRAPDLLPTPWRQHRG
PSGAAAPGCALPRGQALEQPRSCRRPQPMARRYDELPHYPGIVD 
GPAALASFPETVPAVPGP YGPHRPPQPLP PGLDSDGLKREKDB I 
YGHPLFPLLALVFEKCELATCSPRDGAGAGLGTPPGGDVCSSDS 
FNEDNTAFAKQVRSERPLFSSNPELDNLMIQAIQVLRFHLLELE 
KGKMPIDLVIEDRDGGCREDFEDYPASCPSLPDQNNIWIRDHED 
SGSVHLGTPGPSSGGLASQSGDNSSDQGVGLDTSVASPSSGGED 
EDLDQEPRRNKKRGIFPKVATNIMRAWLFQHLSHPYPSEEQKKQ 
LAQDTGLTILQVNNWFINARRRIVQPMIDQSNRTGQGAAF3PEG 
QPIGGYTETEPHVAFRAPASVGDEFGTRKEEWHYL 


5543 


2405 


665 


RWVREQPWPLRTSEAVKTPALRPFPGPRGVSPFPKPDWGKSPAP

KRPFSDSGAFWSPERRPGVLEAPRRRPVPASFRAVPPKPTRVHG 

SSASRDRVLARTMIVADSECRAELKDYLRFAPGGVGDSGPGEEQ 

KESRARRGPRGPSAFIPVEEVLREGAESLEQHLGLEALMSSGRV 

DNLAWMGLHPDYFTS FW RLHYLLLHTDG PLASS WRHY I AI MAA 

ARHQCSYLVGSHMAEFLQTGGDPEWLLGLHRAPEKLRKLSEINK 

LLAHR P WL I TKEHIQALLKTGEHTWS LAEL IQAL VLLTHCHS LS 

SFVFGCGILPEGDADGSPAPQAPTPPSEQSSPPSRDPLNNSGGF 

ESARDVEALMERMQQLQESLLRDEGTSQEEMESRFELBKSESLL 

VT PS AD I LE PS PH PDMLCF VE DPTFG Y EDFTRRG AQAP PTFRAQ 

DYTWEDHGYSLIQRLYPEGGQLLDEKFOAAYSLTYNTIAMHSGV 

DTSVLRRAIWNYIHCVFGI RYDDYDYGEVNQLLERNLKVY I KTV 

ACYPEKTTRRMYNLFWRHFRH^EKVHVJTLLIiLEARMQAALLYAL 

RAITRYMT 




1895 


514 


LGGLLGRQRLLLRMGAGRLGAPMERHGRASATS VS SAGEQAAGD 
PEGRRQEPLRRRASSASVPAVGASAEGTRRDRLGSYSGPTSVSR 
QR VE S LRKKRPL F PWFGLD IGGTLVKLVYFE PKD I TAEEE EEEV 
ESLKSIRKYLTSNVAYGSTGIRDVHLELKDLTLCGRKGNLHFIR 
FPTHDMPAFIQMGRDKNFSSLHTVFCATGGGAYKFEQDFLTIGD 
LQLCKLDELDCLIKGILYIDSVGFNGRSQCYYFENPADSBKCQK 
LPFDLKN PYPLLLVN IGSGVS ILAVYSKDNYKRVTGTSLGGGTF 
FGLCCLLTGCTTFEEALEMASRGDSTKVDKLVRDIYGGDYBRFG 
LPGWAVASS FGNMMSKEKREAVS KEDLARATLI T I TNNIGS 1 AR 
MCALN EN INQWF VGJIFLRINTIAMRLLA YALD YWS KGQLJCALF 
SEHEGYFGAVGALLELLK I p 






WO 01/53312 



PCT/US00/34263





5545 


802 


131 


GAMWSAGRGGAAWPVLIX3LLIALLVPGGGAAKTGAELVTCGSVL 
KLLNTHHRVRLHSIID1KYGSGSGQQSVTGVEASDDANSYWRIRG 
GSEGGCPRGSPVRCGOAVRLTHVLTGKNLHTHHFPSPLSNNQEV 
S AFG EtXSEG DDLDLWTVRCS GQHWEREAAVR FQHVGTS VPLS VT 
GEQ YGS P IRGQHE VHGM PS ANTHNTWKAMEG I F I KPS VE PS AGH 
DEL 


5546


1592 


146 


FVPRGGHSS MGQSGRSRHQKRARAQAQLRNLEAYAANPHS FVFT 
RGCTGRNIRQLSLDVRRVMEPLTASRLQVRKKNSLKDCYAVAGP 
LGVTH FL I LS KTETNVYFKLMRLPGG PTLTFQVKKYSLVRDWS 
SLRRHRMHEQQFAHPPIiLVLNSFGPHGMHVKLMATMFQNLFPSI 
NVHKVNLNTI KRCLL1DYNPDSQELDFRHYS I KWPVGASRGMK 
KLLQEKFPNMS RLQD1S ELLATGAGLS ESEAEPDGDHNITELPQ 
AVAGRGNMRAQQSAVRLTEIGPRMTLQLIKVQEGVGEGKVMFHS 
FVS KTE EELQ A I LE AKEKKLRLKAQRQ AQQAQNVQRKQEQRE AH 
RKKSLEGMKKARVGGSDEEASGIPSRTASLELGEDDDEQEDDDI 
EYFCQAVGEAPSEDLFPEAKQKRLAKSPGRKRKRWEMDRGRGRL 
CDQ KFPKTKDKSQG AQARRGPRGAS RD GGRG RG RGR PGKR VA 


5547 


1592 


146 


FVPRGGHSSMGQSGRSRHQKRARAQAQLRNLEAYAANPHSFVFT
RGCTGRNIRQLSLDVRRVMEPLTASRLQVRKKNSLKDCVAVAGP 
LGVTHFLILSKTETNVYFKLMRiPGGPTLTFQVKKYSLVRDWS 
SLRRHRMHEQXJFAHPPLLVLNSFGPHGMHVKLMATMFQNLFPSI 
NVH KVNLNT I KRCLLID YNPD SQE LD FRHYS I KWPVGASRGMK 
KLLQEKFPNMSRLQDISELLATGAGliSESEAEPDGDHNlTELPQ 
A VAGRGNMRAQQSAVRLTE IGPRM TLQLIKVQEG VGEGKVMFHS 
FVS KT E EELQAI LEAKE KKLRLKAQRQAQQAQNVQRKQEQREAH 
RKKSLEGMKKARVGGSDEEASGIPSRTASLELGEDDDEQEDDDI 
EYFCQAVGEAPSEDLFPEAKQKRLAKSPGRKRKRWEMDRGRGRL 
CDQKFPK'TKDKSQGAQARRGPRGASRDGGRGRGRGRPGKRVA 


5548


1 


2153 


DQTGPPETIAFTFPRSTMEPLCPLLLVGFSLPLARALRGNETTA 
DSNETTTTSGPPDPGASQPLLAWLLLPLLLLLLVLLLAAYFFRF 
RKQRKAWSTSDKKMPNGILEEQEQQRVMLLSRSPSGPKKYFPI 
PVEHLE EE IR1RSADDCKQ FREE FNSLPSGHIQGTFELAN KEEN 
REKNR YPNILPNDHSR VILSQLDG IPCSDYIHAS YIDG YKEKNK 
FIAAQGPKQETVNDFWRMVWEQKSATIVMLTNLKERKEEKCHQY 
WPDQGCWTYGNIRVCVEDCVVLVDYTIRKFCIQPQLPDGCKAPR 
LVSQLHFTSWPDFGVPFTP IGMLKFLKKVKTLNPVHAGP I WHC 
SAGVGRTGTFIVXDAM^SAMMHAEQKVDVFEFVSRrRNQRPQMVQ 
TDMQYTFIYG^LEYYLYGDTELDVSSLEKHLQTMHGTTTHFPK 
IGLEEEFRICLTNVRIMKENMRTGNLPANMKKARVIQIIPYDFNR 
VILSMKRGQEYTDYINASFIDGYRQKDYFIATQGPLAHTVEDFW 
RMIWEWKSHTIVMLTEVQEREQDKCYQYWPTEGSVTHGEITIEI 
KNDTLSEAISIRDFLVTLNQPQARQEEQVRWRQFHFHGWPEIG 
I PAEGKGM IDL I AAVQKQQQQTGNHP I TVHCS AGAGRTGTF I AL 
SNILERVKAEGLLDVFQAVKSLRLQRPHMVQTLEOYEFCYKWQ 
DFIDI FSDYANFK 


5549 


915 


256 


FEATGGKRLAFKMAGTARHDREMAIQAKKKLTTATDPIERLRLQ 
CLARGSAGIKGLGRVFRIMDDDNNRTLDFKEFMKGLNDYAWME 
KEEVEELFQRFDKDGNGTIDFNEFLLTLRPPMSRARKEVIMQAF 
RKLDKTGDGVITIEDLREVYNAKHHPKYQNGEWSEEQVFRKFLD 
NFDS PYD KDGLVT PE E FMNYYAGVSAS I DTDVYF 1 1 MMRTAWKL 


5550 


2364 


1210 


RKRKVFLK^RLNRKKTLSLVKELDAFPKVPESYVETSASGGTV 
SL1AFTTMALLTIMEFSVYQDTWMKYEYEVDKDFSSKLRINIDI 
TVAKKCQYVGADVLDLAET^ASADGLVYEPTVFDr.SPQQKEWQ 
RMLQLIQSRLQEEHSLQDV1FKSAFKSTSTALPPREDDSSQSPN 
ACR IHGHL YWKVAGNFHITVT3KAIPHPRGHAHLAALVNHES YN 
FSHRIDHLSPGBLVPAIINPLDGTEKIAIDHNQMFQYFITWPT 
KLHTYK I SADTHQFS VTERER I INHAAGSHGVSG I FMKYDLSSL 
MVTVTEEHMPFWQFFVRLCGIVGG JFSTTGMLHGIGKFI VEI I C 
CRFRLGSYKPVNSVPFEDGHTDNHLPLLENNTH 


5551


211


1700 


MQRDHTMDYKESCPSVSIPSSDEHREKKKRFTVYKVLVSVGRSE 



WFVFRRYAEFDKLYNTLKKQFPAMALKIPAKRiFGDNFDPDFIK 
QRRAGLNEFIQNLVRYPELYNHPDVRAFLQMDSP3CHQSDPSEDE 
DERSSQKLHSTSQNINLGPSGNPHAKPTDFDFLKVIGKGSFGKV 
« u\jji^r x /iv is. v i/^iuu. v LiNRi\±,y KHIKAERNVLLKNVKH 
PFLVGLHYSFQTTEKLiYFVLDFVNGGELFFHLQRERSFPEHRAR 
FYAAEI ASALGYLHSI KIVYRDLKPEWILLDSVGHWLTDFGLC 
KEG I AI S DTTTTFCGT PE Y LAPE VI RKQP YDNT VDWWCLGAVL Y 
EMLYGLPPFYCRD VAEMYDNI LHKPLSLR PGVSLTAWS ILE ELL 
EKDRQNRLGAKEDFLE IQNHPFFESLS WADLVQKKI PPPFNPNV 

AGPDDIRNFDTAFTEETVPYSVCVSSDYSIVNASVLEADDAFVG 
FSYAPPSEDLFL


5552


2746 


930 


IXSPAAGAAMGKKHKKHKAEWRSSyEDYADKPLEKPLKLVXKVGG 
SEVTELSGSGHDSSYYDDRSDHERERHKEKKKKKKKKSEKEKHL 
DDEERRKRKEEKKRKREREHCDTEGEADDFDPGKKVEVEPPPDR 
PVRACRTQPAENESTPIQQLLEHFLRQLQRKDPHGFFAFPVTDA 
I APGYSMI I KM PMDFGTMKDKI VANEY KS VTEFKADFKLMCDNA 
MTYNRPDTVYYKLAKKI LHAG FKMMSKQAALLGNBDTAVEEP VP 
tv V ^vgvbTAKKSKKPSREVlSCMFEPEGNACSLTDSTAEEHVL 
ALVEHAADEARDRINRFLPGGKMGYLKRNGDGSLLYSVVNTAEP 
DADEEETH P VDLSS LS S KLLPG FTTLG FXDERRNKVTFLSSATT 
ALSMQNNSVFGDLKSDEMELLYSAYGDBTGVQCAI4SI1QEFVKDA 
GSYSKKWDDLLDQITGGDHSRTLFQLKQRRNVPMKPPDEAKVG 
DTLGDSSSSVLEFMSMKSYPDVSVDISMLSSLGKVKKELDPDDS 
HLNLDETTKLLQDLHEAQAERGGSRPSSNLSSLSNASERDOHHL 
GSPSRLSVGEQPDVTHDPYEFLQSPEPAASAKT


5553


74 


1095 


LAjttttHV xijvbKMUC:i>VAhHAKybPi''riVVTPJjLESWALSOVAGWP 
VFLKCENVQp SGSFK I RG IGH FCQEMAKKG CRHLVCS SGGNAG I 
AAAYAARKLG IPATI VL PESTSLQWORLOGEGAE VQLTGKVWD 
EANLRAQELAKRDGWENVPPFDH PL I WKGHAS LVQELKAVLRTP 
PGALVLAVGGGGLLAG WAGLLEVGWQH VPI IAMETHGAHCFNA 
A ITAGKLVTLPDI TS VAKS LGAKTVAARALBCMQVCKIHSEWE 
DTEAVSAVQQLLDDERMLVEPACGAALAAIYSGLLRRLQAEGCL 
PPSLTS WVI VCGGNNINSRELQALKTHLGQV 




166 


2318 


^GRTGGRGSLRPAENVCLTCKLSGAETRGLLCPALRTWIMKVL

GRS FFWVLFPVLPWAVQAVEHEEVAQRVI KLHRGRGVAAMQSRQ 

WVRDSCRKLSGLLRQKNAVLNKLKTAIGAVEKDVGLSDEEKLFQ 

VHTFE I FQKELN ES ENS VFQAWGLQRAIjQGD YKD VVNMKES SR 

QRLEALREAAIKEETEYMELLAAEKHQVEALKNMQHQNQSLSML 

DEILEDVRKAADRLEEBIEEHAFDDNKSVKGVNFEAVLRVEEEE 

ANSKQNITKREVEDDLGLSMLIDSQNNQYILTKPRDSTIPRADH 

HFIKDIVTIGMLSLPCGWLCTAIGLPTMFGYIICGVLLGPSGLN 

SIKSIVQVETLGEFGVFFTLFLVGLEFSPEKLRKVWKISLQGPC 

i . » x jj urixftT Vj1jJjWU«t.1jIjK x KJr lAJ^ V r I S TCuS uSS TPLVS RFLM 

GSARGDKEGDIDYSTVLLGf3LVTQDVQLGLFKAVMPrLIQAGAS 

ASSSIWEVLRILVLIGQILFSLAAVFLLCLVIKKYLIGPYYRK 

LHMES KGNKE I L I LG I SAFI FLMLTVTE LLDVSMELGCFLAGAL 

VSSQGPWTEEIAT3IEPIRDFLAIVFFASIGLHVFPTFVAYEL 

T VI1VFLTLSVVVMKFLI1AALVLSI1I LPRSSQYI KWI VSAGLAQV 

S3FSFVLG5RARRAGVISREVYLLILSVTTLSLLLAPVLWRAAI 

TRCVPRPERRSSL 


5555 
5556


212 
5835


1425 
3346 


LSU?TRETPAPPRCEAASQGRVG't«lADAAAEEAVRSVWNRTRDR 
GTMAPQNlSTFCLLLLYLlGAVIAGRDFYKlhQVPRSASIKDJK 
KAYRKLALQLHPDRNPDDPQAQEKFQDLGAAYEVLSDSEKRKQY 
DTYGEEGLKDGHQSS HGD I FSHFFGD FG FM FGG TPRQQDRNI PR 
GSDI I VDLE VTLE E VYAGNFVEWRNKP VARQAPG KRKCNCRQ E 
MRTTQLGPGRFQMTQEVGCDECPWVKLVWEERTLEVEIEPGVRD 
GMEYPFIGEGEPHVDGEPGDLRFRIKWKHPIFERRGDDLYTNV 
TISLVESLVGFEMDITHLDGHKVHISRJ5KITRPGAKLWKKGEGL 
PNFDNNNI KGSLI ITFDVDFPKEQLTEEAREGIKQLLKQGSVQK 
VYM3LQGY 

RTRGMS KNCV PMEFEE YLLRM FQGTFYLLQKI TKDNNAHTVKS R



LEELDES YIEKFTDFLRLFVS VHLRRIBS YSQFP WEFXtEEfK

YTFHQPTHEG YFSCLDI WTLFLD YLTS KI KSRLGDKEAVLNRYE 

DALVLLLTEVLNRIQFRYNQAQLEELDDETLDDDQQTEWQRYLR 

QS LE WAKVMELLPTHAFSTL PP VLQ DNLE VYLGLQQFI VTSGS 

GHR LNITAENDCRRLHCS LRDLS SLLQAVGRLAEYFIGDVFAAR 

FWDALTWERLVKVTLYGSQ I KLYNI ETAVPS VLKPDLIDVHAQ 

SIAALQAYSHWLAQYCSEVHRQNTQQFVTLISTTMDAITPLIST 

KVQDKLLLSACHLLVSLATTVRPVFLISIPAVQKVFNRITDASA 

LRLVDKAQVLVCRAI*SNI LLLPWPNLPENEQQWPVRSINHASLI 

SALSRDYRNLKPSAVAPQRKMPLDDTKLIIHQTLSVLEDIVENI 

SGESTKSRQICYQSLQESVQVSLALFPAFIHQSDVTDEMLSFFL 

TLFRGLRVQMG VPFT E Q 1 1 QTFLNMFTREQLAES ILHEGSTGCR 

WEKFLKILQVWQEPGQVFKPFLPSI IALCMEQVYP 1 1 AERPS 

PDVKAELFELLFRTIJIHNWRYFFKSTVIjASVQRGIAEEQMENEP 

QFSAIMOAFGQSFLQPDIHLFKQNLFYLETLNTKQKLYHKKIFR 

TAMLFQFVNVLLC2VLVHKSHDLLQEEIGIAIYNMASVDFDGFFA 

AFLPEFLTSCDGVDANQKSVLGRNFKMDRVRRERGRAKRRAEWA 

RKPGTCAARRGHIEASGRGLCPPCSLAAAHEMPADLVL 


5557 


1712 


491


VILGAGLRDKDMWI PWGLPRRLRLSALAGAGRFCILGSEAATR
KHLPARNH CGL5DS S PQLWPE PDFRNPPRXASKASLDFKR YVTD 
RRLAETLAQI YLGKPS RP PHLLLECNPGPGILTQALLEAGAKW 
ALESDKTFI PHLESLGKNLDGKLRVIHCDFFKLDPRSGGVI KPP 
AMSSRGLFKNLGIEAVPWTADIPLKVVGMFPSRGEKRALWKLAY 
DLYS CTS I YKFGRI E VNM FIGE KE FQ KLMADPGNPDL YHVLS VI 
WQLACEIKVLHMEPWSSFDIYTRKGPLENPKRRBLLDQLQQKLY 
LI QMI PRQNLFTRNLTPMNYN I FFHLL KHCFGR RS AT VIDHLRS 
LTPLDARDILMQIGKQED3KWNMHPQDFKTLFETIERSKDCAY 
KWLYDETLEDR 


5558 


1509 


96 


RAGCrHPQVPADLGAPAEPRRPQKTCVCLIiQPQPGGQRGPTrMI 
TGVFSMRLWTPVGVLTSLAYCLHQRRVAIAELQEADGQCPVDRS 
LLKLKMVQWFRHGARSPLKPLPLEEQVEWNPQLLEVPPQTQFD 
YTVTNLAGGPKPYSPYDSQYHBTTLKGGMFAGQLTKVGMQQMFA 
U3ERLRKNYVEDIPFLSPTFNPQEVFIRSTNIFRNLESTRCLIA 
GLFQCQKEGPIIIHTDEADSEVIjYPNYQSCWSLRQRTRGRRQTA 
SLQPGISEDIiKKVKDRMGIDSSDKVDFFILliDNVAAEQAHNLPS 
CPMLKRFARMI EQRAVDTS LYI LP KEDRES LQMAVGP FLH I LKS 

NLLKAMDSATAPDKIRKLYLYAAHDVTFIPLLMTLGIFDHKWPP
FAVDLTMELYQHLESKEWFVQLYYHGKEQVPRGCPDGLCPLDMF
LNAMSVYTLSPEKYHALCSQTQVMBVGNEE


5559 


150 


1983 


PLAATAHFAKMSRVAKYRRQVSEDPDIDSLLETLSPEEMEELEK
ELDVVDPDGSVPVGLRQRNQREKQSTGVYNREAMLNFCEKETKK
LMQREMSMDESKQVETKTDAKNGEERGRDASKKALGPRRDSDLG
KEPKRGGLKKSFSRDRDEAGGKSGEKPKEEKIIRGIDKGRVRAA
VDKKEAGKDGRGEERAVATKKEEEFCKGSDRNTGLSRDKDKKREE
MKEVAKKEDDEKVKGERRNTDTRKEGEKMKRAGGNTDMKKEDEK
VKRGTGNTDTKKDDEKVKKNEPLHEKEAKDDSKTKTPEKQTPSG
PTKPSEGPAKVEEEAAPSIFDEPLERVKNNDPEMTSVNVNNSDC
ITNEILVRFTEALEFNTVVXLFALANTRADDHVAFAIAIMLKAN
KTITSLNLDSNHITGKGILAIFRALLQNNTLTELRFHNQRHICG
GKTEMEIAKLLKENTTLLKLGYHFELAGPRMTVTNLLSRNMDKQ
RQKRLQEQRQAQEAKGEKKDLLEVPKAGAVAKGSPKPSPQPSPK

PS P KNS PKKGGAP AAP P P P P PPLAP PL I MENLKNSLSPATQRKM 
GDKVLPAQEKNS RDQLLAA IRS SNLKQLKKVE VPKLLQ 


5560


9


921 


o o v v cj r oftuo v a piauijo r 3U uyivryy LKJ r Li V JjJSijrjLjbAIiECVAM 
QQRIGEIVAEMDVPLHCRTEFSTQEEEQLRAQGSTDYFLSSGDK 
IRFFFE RGVFDEKGNFLVP PEKS I NKI GHALHAHD P VFKS ITHS 
FKVQTLARSIiGLQMP WVQSM YIFKQPH FGGE VSPHQDAS FLYT 
EPLGRVLGVWIAVEDATLENGCLWFIPGSHTSGVSRRMVRAPVG 
S APGTS FLGS E P ARONS L F VPTP VQRGALVL IHGEWHKS KQNL 
SDRSRQAYTFHLMEASGTTWSPENWLQPTAELPFPQLYT 


5561


2175 


1775 


CYFIFQFFSSPYPGLHPHQTPAPLPNPGLYPPPVSMSPGQPPPQ 
QLLAPTyPSAPGVMNFGNPSYPYAPGALPPPPPPHLYPNTQAPS 

QVYGGVTY YNPAQQQVQ PKPS PPRRT PQP VTI KPPPPEWSRGS 
S 


5562 




1365 


SSGKNDMAAAGAAGLVRGLKAGVLSQADYLNLVQCETriEDIikLri 
LQS 7D YGN FLANEAS PLTVS V IDDRLKE KMVVE FRHMRNHAYE P 
LASFLDFITYSYMIDNVILLITGTLHQRSIAELVPKCHPLGSFE 
QMEAVNIAQTPAELYNAILVDTPLAAFFQDCIS EQDLDEMNI EI 
IRNTLYKAYLES FYKFCTLLGGTTADAMCP I LE FEADRRAFI IT 
INSFGTELSKEDRAKLPPHCGRLYPEGLAQLARADDYEQVKNVA 
DYYPEYKLLFEGAGSNPGDKTLEDRFFEHEVKLNKLAFLNQFHF 
GVFYAFVKLKEQECRNIVWIAECIAQRHRAKIDNYIPIF 


5563


342 


1355 


SS GKNDMAAAGAAGL VRGL&AG VLSQADYLNL VQCETLfiDL KLH 
^STDYGNPLANE AS PLTVS VIDDRLKEKMVVEFRHMRNHAYEP 
LASFLDFITYSYMlDNVILLITGTLHQRSIABLVPKaiPLGSPE 
QMEAVN I AQT PAELYN AI LVDTPIiAAF FQDCI S EQDLDEMN I E I 
IRNTLYKAYLES FYKFCTLLGGTTADAMCP I LEFEADRRAFI IT 
INSFGTELSKEDRAKLFPHCGRLYPEGLAQLARADDYEQVKNVA 
DYYPEYKUiFEGAGSNPGDKTLEDRFFEHEVKLNKLAFLNQFHF 
GVFYAFVKLKEQECRNI VW I AECI AQRHRAKIDNYI PI P 


5564 


3 


914 


RVRRDKRAVWTARGRRRCGDSMSGGWMAOVGAWRTGALGLALLL 
LLGLGLGLEAAASPLSTPTSAQAAGPSSGSCPPTKFQCRTSaiiC 
VPLTWRCDRDLDCSDGSDEBECRIEPCTQKGQCPPPPGLPCPCT 
GVSDCSGGTDKKLRNCSRLACLAGELRCTLSDDCIPLTWRCDGH 
PDCPDSSDELGCGTNEH.PEGDATTMGPPVTLESVT5LRNATTM 
GPP VTLES VPS VGNATSSSAGDQSGS PTAYG VI AAAA VLSAS LV 
TATLLLLSWLRAQERLRPLGLLVAMKESLLLSEQKTSLP 


5565 


993 


138 


RWNS PNPARAGS I SRPQRAPGS VSAVAMTAAVFFGCAFIAFGPA 
LALYVFT I ATE PLRI I PL I AGAFFWLVS LLI SSLVWFMAR VI ID 
NKDGPTQKYLLI FGAFVS VYIQEMFRFAYYKLLKKASBGLKS IN 
PGETAPSMRLLAYVSGLG FG IMSG VFSF VNTLSDS LG PGTVG IH 
GDSPQ FFLYS AFMTL VI ILLH VFWGI VF FDGCE KKKWG I LL I VL 
LTHLLVSAQTFISSYYGINLASAFIILVLMGTWAFLAAGGSCRS 
LKLCLLCQDKNFLLYNQRSR 


5566 


2043 


1232 


SHIQHHGRGAQAPVKMVSWMISRAWLVFGMLYPAYYSYKAVKT 
KNVKEYVRWMMYWIVFALYTVIETVADQTVAWFPLYYELKIAFV 
I WLLS PYTKGASL I YRKFLH PLLSSKERE I DDYI VQAXERGYET 
MVNFGRQGLNLAATAAVTAAVKSQGAI TERLRSFSMHDLTTIQG 
DEPVGQRP YQPL PEAKKKSKPAPSESAG YG I PLKDGDEKTDEEA 

EGPYSDNEMLTHKGPRRSQSMKSVKTTKGRKEVRYGSLKYKVKK 
RPQVYF 


5567 


1554 


233 


EFLGSGVSPDLANEDGLTALHOCCIDDFREMVQOLLEAGANINA 
CDS ECWTPLHAAATCGHLHLVELLIASGANLLAVNTDGNMPYDL 
CDDEQTLD CLETAMADRG ITQDS I EAARAVPELRMLDDIRSRLQ 
AGADLHAPLDHGATLLHVAAANGFSE AAALLLEHRAS LS AKDQD 
G WE PLHAAAYWGQVPLVELL VAHGADLNAKS LMDETPLDVCGD E 
EVRAKLLELKHKHDALLRAQSRQRSLLRRRTSSAGSRGKWRRV 
SLTQRTDLYRKQHAQEAIVWQQPPPTSPEPPEDNDDRQTGAELR 
P P PPEEDNPEWRPHNGRVGGS PVRHLYS KRLDRSVS YQLSPLD 
STTPHTLVHDKAHHTLADLKRQRAAAKLQRPPPEGPESPETAEP 
GLPGDTVTPQPDCGFRAGGDPPLLKLTAPAVEAPVERRPCCLLM 


5568 
5569 


1731 
2 


587 

835 


AJEDRQPAS RRCiAGTTAAMAASGPGCRSWCLCPEVPSATFFTALL 
SLLVSGPRLFLLQQPLAPSGLTLKS EALRNWQVYRLVTYI FVYE 
NPISLLCGAI I IWRFAGNFERTVGTVRHCFFTVIFAIFSAIIFL 
S FEAVSS LS KLGE VEDARGFTP VAFAMLGVTTVRSRMRRALVFG 
M WPSVLVPWLLLGAS WLIPQTS FLSNVCGLS IGLAYGLTYCYS 
IDI^ERVALKLDQTFPFSLMRRISVFKYVSGSSAERRAAQSRKL 
NPVPGSYPTQSCHPHLSPSHPVSQTQHASGQKLASWPSCTPGHM 
PTLPPYQPASGLCYVQNHFGPNPTSSSVYPASAGTSLGIQPPTP 
VNSPGTVYSGALGTPGAAGSKESSRVPMP 

QTPCPLAWERGSRSEDISVPGOKPPTCSSySGMbVGPSSLPHLGH 








LKLLL LL t>L L P LRGQANTGCYG I PGMPGLPGAPG KDGYDGL PG P 
KGEPG I PAI PG I RG PKGQ KG EPGLPGHPGKNG P MG P PGMPG VPG 
PMGIPGEPGEBGRYKQKFQSVFTVTRQTHQPPAPNSLIRFNAVL 
TNPQGDYDTSTGKFTCKVPGLYYFVYHASHTANLCVIiLYRSGVK 
WTFCGHTS KTNQ VNSGG VIiLRLQVGESVMLAVNDYYDM VG I QG 
SDSVPSGFLLFPD 


5570 


264 


946 


RDRRDRGGVATSTEEPARPRAPQSRGPGPVSQTGRGRERGGGDT 
MSS PS PGKREMDTDWKL 1 ESKHBVTILGGLNEPWKPYGPQGT 
PYEGGVWKVRVDLPDKYPFKSPSIGFMNKIFHPN1DEASGTVCI. 
DVINQTWTALYDLTNXFESFLPQLLAYPNPIDPLWGDAAAMYLH 
RPEBYKQKIKBYIQKYATEEAUKEQEEGTGDSSSESSMSDFSED 
EAQDMEL 


5571 


264 


946 


RDRRDRGG VATS TE EPAR PRA PQSRG PG P VSQTGRGRERGGGDT 
MS S PS PGKRRMDTDWKL I ESKHEVTI LGGLNE FWKFYGPQGT 
P YEG G VWKVRVDLP DKYP FKS PS IG FMNKI FH PN 1 DE ASGTVCL 
D V INQTWTALYDLTNI FE SFLPQLLAY PNP I DPLNGDAAAM YLH 
RPEEYKQKIKBYIQKYATEEALKEQEEGTGDSSSESSMSDFSED 
EAQDMEL 


5572 


2802 


2085 


RTDYRTGI PGRRFRVMAAGDGDVKLGTLGSGSES SNDGGSES PG 
DAGAAAEGGGWAAAALALLTGGGEMLLNVALVALVLLGAYRLWV 
R»GRROhGAQAQAGEBSPATShPRMKKRDFShEQhRQYDGSRNP 
RILLAVNGKVFDVTKGSKFYGPAGPYGIFAGRDASRGliATFCLD 
KDALRDEYDDLSDLNAVQMESVREWEMQFKEKYDYVGRLLKPGE 
EPSEYTDEEDTKDHNKOD 


5573 

2562 


219 


VPARTPNAEDQGPEARAATAT PCQSGGRERAGEAAED6VKMAAF 

semgvmp2 iaqaveemdwllptdiqaes ipl1lgggdvlmaaet 
gsgktg afs i p v i q i vyetlkdqqbg kxgktti ktgasvlnkwq 
mnpydrgsafaigsdglccqsrevkewhgcratkglmkgkhyyb 
vschdqglcr vg ws tmqas ldlgtdkfg fgfggtg kkshnkq fd 
nygeeftmhdtigcyldidkghvkfskngkdlglafeipphmkn 
oalfpacvlknaelkfnfgeeefkfppkdgfvalskapdgyivx 
sqhsgnaqvtqtkflpnapkalivepsrelaeotlnnikqfxky 

IDNPKIJIELLIIGGVAARDQLSVLENGVDIVVGTPGRLDDLVST 
GKLNLS Q VRFL VLDEADGLLS QG YSDF I NRMHNQ I PQ VTS DGKR 
LQV I VCS ATLH S FD VKKLS EKI MHF PTWVDLKGE DS VPDT VHHV 
WP VN P KTDRLWERLGKSH I RTDDVHAKDNTR PG ANS PEMWSEA 
I KILXGB YAVRA I KEHKMDQA 1 1 FCRTKIDCDNLEQ YF IQQGGG 
PDKKGHQFSCVCLHGDRKPHERKQNLERFKKGDVRFLICTDVAA 
RGID1HGVPYVINVTLPDEKQNYVHRIGRVGRAERMGLAISLVA 
TEKE K\Afl YHVCSSRGKGCYNTRLKBDGGCTI WYNEMQLLSE IEE 
HLNCTISQVEPDI KVPVDEFDGKVTYGQKRAAGGGS YKGHVDI L 
APTVQELAALEKEAQTSFLHLGYLPNQLFRTF 


5574 


1731 


952 


NEGLEVFKEQELQP3DKGAVPEDASTERSAMASLGLQLVGYILG 
LLGLLGTLVAMLLPSWKTS S YVGAS IVTAVGFS KGLWMECATHS 
TGITQCDIYSTX.LGLPADIQAAQAI^WTSSArSSLACIISVVGM 
RCTVFCQ E SRAKDR VAVAGG VF F I LGGLLGF I P VAWNLHG I LRD 
FYS PLVPDSMKFE IGE ALYLG IISSLFSLIAGII LCFSCS CQRN 
RSNYYDAYQAQPLATRSSPRPGQPPKVXSEFNSYSLTGYV 


5575 


456 


766 


LLWALPCPPPTAAAVLLSSTGLMELLEXMLALTLAKADSPRTAL 
LCSAWLLTAS FSAQQHKGSLQKDPLLSQAC VGC LEALLDYUDAR 
SPDIGRNSPHYLMFP 


5576 


249 


2146 


RS WGAP W F WRMRLLRRRtlMP DlLAM VGCA F VLF LFLL^ttRDVS S R 
EEATEKPWLKSLVSRKDHVLDLMLEAMNNLRDSMPKLQIRAPEA 
QQTLFSINQSCLPGPYTPAELKPFWERPPQDPNAPGADGKAFQK 
SKWTPLETQEKEEGYKKHCFNAFASDRISLQRSLGPDTRPPECV 
DQKFRRCPPLATTSVIIVFHNEAWSTLLRTVYSVLHTTPAILLK 
EIILVDDASTEBHLKEKLEQYVKQLQWRWRQEERKGLITARL 
I/lASVAQAEVLTFLDAHCECFHGWLEPLIiARIABDKTVVVSPDI 
VTIDLNTFEFAXPVQRGRVHSRGNFDWSLTFGWBTLPPHEKQRR 
KDETYPIKSPTFAGGLFSISKSYFEHIGTYDNQMEIWGGENVEM 








SFRVWQCGGQLElIPCSVVGHVFRTKSPHTFPKtiTSVIARNQVH 
LAEVWMDSYKKIFYRRNLQAAKMAQEKSFGDISERLQLREQLHC 
HNFSWYIJINVYPEMFVPDLTPTFYGAIKNLGTNOCLDVGENNRG 
GKPLIMYSCHGLGGNQYFEYTTQRDLRHNIAKQLCLHVSKGALG 
LGSCHFTGKNSQVPiCDEEWEIAQDQLIRNSGSGTCLTSQDKKPA 
MAPCNPSDPHQLWLFV 


5577 


3 


1275 


RNSDCS CGE IS VHCLP WVLFI LDLKVES SMFCPLKL I LLPVLLD 
YS LGLNDLWSPP ELTVHVGDS ALMGCVFQSTEDKC I FKIDWTL 
SPGEHAKDEY VLY YYSNLSVP IGRFQNRVHLMGDI LCNDGSLLL 
QDVQEAVQGTYICEIRLKGESQVFKKAVVhW/hPEEPKBUWHV 
GGLIQMGCVFQSTEVKHVTKVEW I FSGRRAKEEI VFR YYHKLRM 
S VEYSQS WGHFQNRVNLVGDI FRNDGS IMLQGVRESDGGNYTCS 
iHLGNLVFKKTIVLHVSPEEPRTLVTPAALRPLVIjGGNQLVIIV 
G I VCAT I LLLPVL I L I VKKTCGNKS SVNSTVLVKNTKKTNPE I K 
EKPCHFERCEGEKHIYSPIIVREVIEEEEPSEKSEATYMTMKPV 
WPS LRSDRNNS LE KKSGGGM P KTQ QAF 


5578 




783 


AVESMASPGAGRAPPELPERNCGYREVEYWDQRYQGAADSAPYD 

M FGDFSS FRALLEP ELRPEDRI LVLGCGNS AIjS YEL FLGGFPNV 

TSVDY3SWVAAMQARYAHVPQLRWBTMDVRKLDFPSASFDWL 

EKGTLDALLAGERDPWTVSSEGVHTVDQVLSEVSRVLVPGGRFI 

SMTSAAPHFRTRHYAQAYYGWSLRHATYGSGFHFHLYLMHKGGK 

LSVAQLALGAQII^PPRPPTSPCFIjQDSDHEDFLSAIQL 


5579 


3 


1540 


rnsglargas alarhggglaggvg wdcgacas rcqg vmeglltr 
cralpalatcsrqlsgyvpcrfhhcaprrgrrlllsrvfqpqni* 
redrvlslqdksddltcksqrlmlqvgliypaspgcyhllpytv 
rameklvrvidqemqaiggqkvnmpslspajblwqatnrwdlmgk 

BLLRIiRDRHGKS YCIiGPTHEEAITAL IASQKKLSYKQLP FLLYQ 

vtrkfrdeprprfgllrgrfcifymkdmytfdsspeaaqqtyslvc 
daycslfnklglpfvkvqadvgtiggtvshefqlpvdigedrla 
icprcsfsanmetldlsqmncpacqgpltktkgievghtfylgt 
kyss i fnaqftnvcgkptlaemgcyglgvtri laaaievlstbd 

CVRWPSLIAPYQACLIPPKKGSKjEQAASELIGQLYDHITEAVPQ 
LHGEVLLDDRTHLTIGNRLKDANKFGYPFVIIAGKRALEDPAHF 
EVWCQNTGBVAFLTKDGVMDLLTPVQTV 


5580 


1681 


450 


ADAGTRC I PGFWPSGAGYSAPAQRGRRS SGR^RAAAAPGLTAP 
WRLLQCCELEAGELGMAVPAAAMGPSALGQSGPGSMAPWC5VSS 
GPSRYVWMQELFRGHSKTREFLAHSAKVHSVAWSCDGRRLASG 
SFDKTASVFLLEKDRLVKENNYRGHGDSVDQLCWHPSNPDLFVT 
ASGDKTIRI WDVRTTKCIATVNTKGENINI CWS PDGQTIAVGNK 
DDWTFIDAKTHRSKAEEQFKFEVNE I SWNNDNNMFFLTNGNGC 
INI LS YPELKPVQS INAHPSNCIC IKFDPMGKYFATGSADALVS 
LWDVDELVCVRCFSRLDWPVRTLSFSHDGKMLASASEDHFIDIA 
EVETGDKLWEVQCESPTFTVAWHPKRPUiAFACDDKDGKYDSSR 
EAGTVKLFGLPNDS 


5581 


541 


947 


GGGSGPRAPSATLLDTGESVAAVASGEDKGIAASAAAAAVFACS 
CS PDPQ SSTMNPVYSPVQPGAP YGNP KNMAYTGYPTAY P AAA PA 
YNPSLYPTNSPSYAPEFQFLHSAYATLLMKQAWP0NSSSCGT2G 
TFHLPVDTGTENRTYQASSAAFRYTAGTPYKVPPTQSNTAPPPY 
SPSPWP YQTAM YP IRSA YPQQJJLYAQGAYYTQPVYAAQPHVIHH 
TTWQPNSIPSAI YPAPVAAPRTNGVAMGMVAGTTMAMSAGTLL 
TTPQHTAIGAHPVSMPTYRAQGTPAYSYVPPHW 


5582 


5715 


2739 


I ITNNNNVI I PIiVI AYHLSGSAQARGERS PAERLMERQKRKADI 
EKGLQFIQSTLPLKQEEYEAFLLKLVQNLFAEGNDLFREKDYKQ 
ALVQYMEGIiNVADYAASDQVAL PRELLCKLHVNRAAC YFTMGL Y 
EKALEDSEKAU5LDSESIRALFRKARALNELGRHKEAYECSSRC 
SLALPHDESVTQLGQELAQKLGLRVRKAYKRPQELETFSLLSNG 
TAAGVADQGTSNGLGSIDDIETDCYVDPRGSPALLPSTPTMPLF 
PHVLDLLAPLDSSRTLPSTDSLDDFSDGDVFGPELDTLLDSLSL 
VQGGLSGSGVPSELPQLIPVFPGGTPLLPPWGGSIPVSSPLPP 
ASFGLVMDPSKKLAASVLDALDPPGPTLDPLDLLPYSETRLDAIj 






WO 01/53312 



PCT/US00/34263 











DSFGSTRGSLDiCPDSFMSETNSQDHRPPSGAQKPAPSPEPCMPN 
T ALLI KN P LAATHEF KQACQLCY PKTGPRAGDYT YREGLEH KCK 
RDILLGRLRSSEDQTWiCRIRPRPTKTSFVGSYniCKDMINKQDC 
KYGDNCTFAYHQEEIDWTEERKGTLNRDLLFDPLGGVKRGSLT 
I AKLLKBH QG I FTFLCE ICFDSKPRIIS KGTKDS PS VCS NLAAK 
HSFYNNKCLVHIVRSTSLKYSKIRQFQEHFQFDVCRHEVRYGCL 
REDSCHFAHSFIELKVWLLQQYSGMTHEDIVQESKKYWQQMEAH 
AGXASSSMGAPRTHGPSTFDLQMKFVCGQCWRNGCWEPDKDLK 
YCSAKARHCWTKERRVLLVMSKAKRKWVSVRPLPSIRNFPQQYD 
LCXHAQNGRKCQ YVGNCS FAHS PE ERDM WTFMKENK I LDMQQT Y 
DMWLKKHNPGKPGEGTPISSREGEKQIQMPTDYADIMMGYHCWL 
CGKNSNSKKQWQQHIQSEKHKEKVFTSDSDASGWAFRFPMGEFR 
LCDRLQKGKACPDGDKCRCAHGQEELNEMLDRREVLKQKIiAKAR 
KDMLLCPRDDDFGKYNFLLQEDGDLAGATPEAPAAAATATTGE 


5583 


3 


1265 


SSGCRQGRPGRSDRPRPPPRRHKMVKETRYYDILGVKPSASPEE 
IKKAYRKLALKYHPDKNPDEGEKFKLISQAYEVLSDPKKRDVYD 
QGGEQAIKEGGSGSPSFSSPMDIFDMFFGGGGRMARERRGKNVV 
HQLSVTLEDIiYNGVTKKLALQKNVICSKCEGVGGKKGSVEKCPL 
CKGRGMKIHIQQXGPGMVQQIQTVCIECKGQGERINPKDRCeSC 
SGAKVIREKKI 1 E VHVEKGMKDGQKILFHGEGDQ5PELEPGDVI 
IVLDQKDHSVFQRRGHDLIWKMKIQLSEALCGFKKTIKTLDNRI 
LVITSKAGBVIKHGDLRCVRDEGMP IYKAPLEKGILI IQFLVIF 
PEKHWLSLE KLPQLEALLP ? RQKVR ITDDMD Q VE LKE FC PNEQN 
WRQHREAYEEDEDGPQAGVQCQTA 


5584 


3 


1265 


SSGCRQGRPGRSDRPRPPPRRHKMVKETRYYDILGVKPSASPEE 
IKKAYRKLALKYHPDKNPDEGEKFKLISQAYEVLSDPKKRDVYD 
QGGEQAX KEGGSG S PSFS S PMD 1 FDMFFGGGGRMARERRGKKTW 
HQhSVTLEDLYNGVTKKLAhQKNVlCEKCBGVGGKKGSVEKCPL 
CKGRGMHIHIQQIGPGMVQQIQTVCIECKGQGERINPKDRCESC 
SGAKVI REKKI IE VHVEKGM XDGQK IL FHGEGDQEPELEPGDVI 
I VLDQ KDHS V FQRRGHDL IMKMKI QLS E ALCG FKKT I KTLDNR1 
LVI TS KAGEVI KHGDLRCVRDEGMPI YKAPLEKG I LI IQFLVIF 
PEKHWLSLEKLPQLEALLPPRQKVRITDDMDQVELKEFCPNEQN 
WRQHREAYEEDEDGPQAGVQCQTA 


5585 


2*19 


915 


LPAGTPESSLHEALDQCMTALDLFLTNQFSEALSYLKPRTKESM 
YHS LTY ATI LEMQ AMMTFDPQ DI LLAGNMMKEAQM LCQRH RR KS 
SVTDSFSShVNRP TLGQFTEEE IHAE VC YAKCLLQRAALTFLQD 
ENMVSF I KGGI KVRNS YQTYKELDSLVQSSQYCKG2NHPHFEGG 
VKLGVGAFNLTLSMLPTR I LRLLEFVG FSGNKD YGLLQLE EGAS 
GHS FRSVLCVMLLLCYHTFLTFVLGTGNVNIEEAE KLLKPY LNR 
YPKGAI FLFLAGR I E VI KGNIDAA I RRFEECCEAQQHWKQFHHM 
CYWELMWCFTYKGQWKMSYFYADLLSKENCWSKATYIYMKAAYL 
SMFGKEDHKPFGDDEVELFRAVPGLKLKIAGKSLPTEKFAIRKS 
RRY FS SNP I SL P VPALEMM YI WNGY AV I GKQPKLTDGILE 1 1 TX 
AEEKLEKGPENEYSVDDECLV1CLLKGLCLKYLGRVQEAEENFRS 
ISANEKKIKYDHYLIPNALLELALLLMEQDRNBEAIKLLESAKQ 
N YKNYS ME S RTH FR I QAATLQAKSSLBNS SRSMVS S VS L 


5586 


2^19 '"■ 


915 


LPAGTPES SLHEALDQCMTALDL FLTNQ FS EALS YLKPRTKESf~ 
YHSLTYATILEMQAMMTFDPQDILLAGNMMKEAQMLCQRHRRKS 
S VTDS FS S L VNRPTLGQFTE EE I HAE VC YAKCLLQ RAALTFLQD 
ENMVS F I KGG I KVRNS YQTYKELDSLVQSSQYCKGENHPHFEGG 
VKLGVGAFNLTLS ML PTRI LRLLE FVG FSGN KD YG LLQLE EGAS 
GHSFRSVLnMLiLLCYHTFLTFVLGTGNVNIEEAEKLLKPYLNR 
i FK5A1J? Ltf LA^KIfcVIKGNIDAAIRRFEECCEAQQHWKQFHHM 
CYWELMWCFTYKGQWKMS YFYADLLSKENCW3 KATYI YMKAAYL 
SHFGKEDHKPFGDDEVELFRAVPGLKLKIAGKSLPTEKFAIRKS 
RRYFSSNP ISLPVPALEMMYI WNG YAVIGKQPKLTDGI LEI I TK 
AEEMLEKGPENEYSVDDECLVKLLKGLCLKYLGRVQEAEENFRS 
I SANEKK I K YDHYL I PNALLE LALLLMBQDRWEEA I KLLESAKQ 
NYKNYSMESRrHFRlQAATLQAKSSLENSSRSMVSSVSL 


5587 


1768 


148 


SSAVPDGAVGRPVAVAVGGPPHSCROIPCCLMAAIGVHLGCTSA 








CVAVYKDX5RAG WANDAGDRVTPAWAXSBNEE I VGLAAKQS RI 
RNISNTVMKVKQILGRSSSDPQAQKYIAESKCLVIBKNGKLRYE 
IDTGEETKFVKPEDVARLIFSKMKETAHSVLGSDANDVVITVPF 
DFGBKQKNALGEAAHAAGFNVLRLIHEPSAALIAYGIGQDS PTG 
KSNILVFKLGGTSLSLSVMEVNSGIYRVLSTNTDDNIGGAHFTE 
TIAQYLAS EFQRS FKHDVRGNARAMMKLTNSAEVAKHSLSTLGS 
ANCFLDSLYEGQDFDCNVSRARFELLCS PLFNKCI EAIRGLLDQ 
NGFTADDINKWLCGGSSRIPKLQQLIKDLFPAVELLNSIPPDE 
VIPIGAAIEAGILIGKENLLVEDSLMIECSARDILVKGVDESGA 
SRFTVLFPSGTPL PARRQHTLQAPGS I S SVCLELYESDGKNS AK 
E ETKFAQ WLQDLDKKENGLRD I LAVLTMKRDG S LHVTCTDQ ET 
GKCEAISIEIAS 


5588 


3 


589 


T PP P PEQAMVAATVAAAW LLLWAAACAQQEQD F YDFKAVN I RG K 
LVSLEKYRGSVSLVVWASECGFTDQHYRALQQLQRDLGPHHFN 
VLAFPCNQFGQQEPDSNKEIESFARRTYSVSFPMFSKIAVTGTG 
AHPAFKYLAQTSGKEPTWNFWKYLVAPDGKWGAWDPTVSVEEV 
RPQITALVRKLILLKREDL 


5589 


1884 


553 


LRQAWHEGGIGQTDKERGAAALPGEEGDPTRGRSLGRASWESGS 
PRRPRSPFSSFLPRPICLSLEARPCS1EDRRNWSLIGRPGAPAS 
GLNRSSGLWLGPDRCRPRSRCSCRVMENPSPAAALGKALCALLL 
ATU3AAGQPLGGESICSARAPAKYSITFTGKWSQTAFPKQYPLF 
RPPAQWSS ZjLGAAHS SD YSMWRKNQY VS NGLRDPAERGEA WALM 
KEI EAAGEALQS VHAVFSAPAVPSGTGQTS AELEVQRRHSLVS ? 
WRI VPS PDWFVGVDS LDLCDGDRWREQAALDLYPYDAGTDSG? 
TFSSPNFAT1PQDTVTEITSSSPSHPANSFYYPRLKALPPIARV 
TLLRLRQS PRAF 1 P P A P VL PSRDNEI VDS AS V PBTPLDCEVS LW 
SSWGLCGGHCGRLGTKSRTRYVRVQPANNGSPCPELEEEABCVP 
DNCV 


5590 


72 


896 


LCSSGALRLLPAMVAWRSAFLVCLAF^tATLVQRdSGDFDDFNL 
EDAVKETS S VKQP WDHT17TTTNRPGTTRAPAKP PGS GLDLADA 
LDDQDDGRRKPG1GGRERWNHVTTTTKRPVTTRAPANTLGKDFD 
LADALDDRNDREDGRRKPIAGGGGFSDKDLED I VGGGE YKP DKG 
KGDGRYGS NDDPG SGMVAEPGT I AG VASALAMAL I GAVS S Y I S Y 
QQKKFCFS I QQGLNADYVKGENLEAWCEEPQVKYSTLHTQSAE 
PPPPPEPARI 


5591 


68 


1494 


AGSSRRAAAERLLVaAGCRSLAGRASGVLLLPAELLPGEEEAMA 
LRVTRNSKINAEKKAK I NMAG AKRVPTAP AATSKPGLRPRTALG 
DIGNKVSEQLQAKMPMKKEAKPSATGKVIDKKLPKPLEKVPMLV 
PVPVSEPVPEPEPEPEPEPVKEEKLSPEPILVDTASPSPMETSG 
CAPABEDL CQAFSDVIhAVNDVDAEDQADPNLCSBYVKDl YAYL 
RQLEEEQAVRPKYIiIiGREVTGNMRAILIDWLVQVQMKFRLLQET 
MYMTVS I IDRFMQNNCVPKKMLQLVGVTAMFIASKYEEMYPPE I 
GDFAFVTDNT YTKHQIRQMEMKI LRALNFGLGRPLPLHFLRRAS 
KIGEVDVEQHTLAKYLMELTMLDYDMVHFPPSQIAAGAFCLALK 
I LDNGE WTP TLQHYLS YTEESLL PVMQHLAKNAAM VNQGLTKHM 
TVKNKYATS KHAKI STLPQLNSALVQDLAKAVAKV 


5592 


242 


924 


YGESKDWNQKDLLSALVLTTVNCLPTPIMAKSAEVKLAIFGRA6 
VGKS AI/WRFI»rKR FI WE YDPTLES TYRHQATIDDEWSMEILD 
TAGQEDTIQREGHMRWGEGFVLVYDITDRGSFEEVLPLKNILDE 
IKKPKNVTLILVGNKADLDHSRQVSTEEGEKLATELACAFYECS 
ACTGEGNITEI FYELCREVRRRRMVQGKTRRRSSTTHVKQAIKK 
MLTKISS 


5593 


3 


1113 


HASGGRAANMA^RGAGQQQSQEMMEVDRRVESEESGDEEGKKK 
SSGIVADLSEQSLKDGEERGEEDPEEEHELPVDMETINLDRDAE 
DVDLNHYRIGKIEGFEVLXKVKTLCbRQNLlKCIENLEELQSLR 
BLDL YDNQ I KKI ENLEALTELE I LD I S FNL LRNI EG VDKLTRLX 
KLFLVNNKISKIENLSNLHQLQMLEU3SNRIRAIENIDTLTNLB 
SLFLGKNKITiO^NLDALTxVLTVI^MQSNRLTKrEGLQNLVOTR 
ELYLSHNGIEVIEGLENNNKLTMLDIASNRIKKIENISHIjTELQ 
EFW^DNLLESWSDLDELKGARSLETVYLERNPLQKDPQYRRKV 








MLALPS VRQ IDATFVR F 


5594 


3 


1113 


HASGGRAANMAAERGAG QQQSQEMME VDRRVES EESGDE EG KKH 
SSGIVADLSEQSLKDGEERGEEDPEESHBLPVDMETINLDRDAE 
DVDLNHVrRIGKIEGFEVLKKVKTLCLRQNLIKCIENLEELQSLR 
ELDLYDNQIKK I ENLE ALTELEI LDI SFNLLRNI EGVDKLTRLK 
KLFLVNNKISKIENLSNLHQLQMLELGSNRIRAI3NIDTLTNLE 
SLFLGKNKITKLQNLDALTNLTVLSMQSNRLTKI EGLQNLVNLR 
EL YLS HNG I E V I EGLENNNKLTW LD I ASNR I KK 1 2NISHLTELQ 
EFWMNDNLLESWSDLDELKGARSLETVYLERNPLQKDPQYRRKV 
MLALPSVRQIDATFVRF 


5595 


3 


1476 


ARWNGRWVQVPAWPGPGCGTNASGERQRQLPRAWRPVGRTLGSE 
PIALAWSPPLYLFPIPLPSWAVSQPTPTLGTMFADLDYDIEEDX 
LGI PT VPGKVTLQKDAQNLIGI S IGGGAQ YCPCLYI VQ VFDNTP 
AALDGT VAAGDE I TGVNGRS I KGKTKVE VAKM I QE VKGEVT I HY 
NKLQADPKQGMSLDIVLKKVKHRLVEWMSSGTADALGLSRAILC 
NDGLVKRLEELERTAELYKGOTEHTKNLLRAPYELSQTHRAFGD 
VFSVIGVRE PQPAASEAFVKFADAHRS IEKFG I RLLKTIKPMLT 
DLNTYLNKAI PDTRLTI KKYLDVKFEYLSYCLKVKEMDDBSYSC 
IALGEPLYRVSTGNYEYRLrLRCRQEARARFSQMRKDVLEKMEL 
LDQKHVQDI VFQLQRLVSTMSKYYNDCYAVLRDAD VFP I EVDLA 
HTTLA YGLNQE E FTDGE EEEEEEDTAAGEPSRDTRGAAGPLDKG 
GSWCDS 


5596 


698 


219 


GAVLAPSSLPAAELAAQGE S QSLBDLSNTSR PTS E VYK IS F I FP 

ngdkydgdctr'tssgiyerngigihttpngivytgswkddkmng 
fgrleh fsgavyegqfkbnm fhglgtytf png akytgn fnenrv 
kgegeythiqgtrmdwtfhftscsqt 


5597 


3 


731 


ISCKMAADGQSSLPASWRSVTLTHVEYPAGDLSGHLLAYLSLSP 
VFV I VGFVTLI I FKRELHTI S FLGGLALNEGVNWL I KNVIQEPR 
PCGGPHTAVGTKYGMPSSHSQFMWFFSVYSFLFLYLRMHQTNNA 
RFLDLLWRHVLS LGLLAVAFLVSYSR VYLLYHTWS Q VL YGGIAG 
GLMAI AWF I FTQE VLTP L FPRI AAWP VSEFFL I RDTS L I PN VLW 
FEYTVTRAEARNRQRKLGTKIiQ 


5598 


326 


2440 


G IGPIAAS FIFCKVASLYI FLSPPPPS VSG VPYSPANSS WS CAL 
VPLLGSGVPPHPPAPSPCCSGQTMLKMLS FKLLLLAVALGF FEG 
DAKFG E RM EGSG ARRRRCLNGNPPKRLKRRDRRMMS QLELLSGG 
EMLCGGFYPRLSCCLRSDS PGLGRLENKI FSVTNNTECGKLLEE 
I KCALCS PHSQS LFHS PEREVLERDLVLPLLCKDYCKEFFYTCR 
GHIPGFLQTTADEFCFYYARKDGGLCFPDFPRKQVRGPASNYLD 
QMEEYDKVEEI SRKHKHNCFCIQEVVSGLRQP VGALHSGDGSQR 
LFILEKEGYVKILTPEGEI FKEPYLDIHKLVQSGIXGGDERGLL 
SLAFHPNYKKNGKLYVSYTTNQERWAIGPHDHILRVVEYTVSRK 
NPHQVDLRTARVFLE VAELHRKHLGGQLLFGPDGFLY 1 1 LGDGM 
I TLDDMEEMDGLSDFTGSVLRLD VDTDIOCNVP YS I PRSNPH FNS 
TNQPPEVFAHGLHDPGRCAVDRHPTD ININLTILC SDSNGKNRS 
SARILQI I KGKD YE S E P5LLE FKPFSNG P LVGGF VYRGCQS ERL 
YGSYVFGDRNGNFLTLQQSPVTKQWQBKPLCLGTSGSCRGYFSG 
HILGFGEDEI/3EVYILSSSKSMTQTHNGKLYKIVDPKRPLMPEE 
CRATVQPAQTLTSECSRLCRKGYCTPTGKCCCSPGWEGDFCRTG 


5599 


326 


2440 


GIGPIAASFIFCKVASLYIFLSPPPPSVSGVPYSPANSSWSCAL 
VPLLGSGVPPHPPAPSPCCSGQTMLKMLS FXLLLLAVALGFFEG 
DAKFGERNEGSGARRRRCLNGNPPKRLKRRDRRMMSQLELLSGG 
EMLCGGF YPRLS CCLRSDS PGLGRLEN K I FS VTNNTECGKLLEE 
1 JtLALiCo PHi> y£> i* FHS PEREVLER DLVLPLLCKD YCKEFF YTCR 
GHIPGFLQTTADEFCFYYARKDGGLCFPDFPRKQVRGPASNYLD 
QMEEYDKVEEISRKHKHNCF CI QEVVSGLRQP VGALHSGDGSQR 
L FILEKEGYVfCILTPEGEI FKEP YLDIHKLVQSGI KGGDERGLL 
SLAFHPNYKKNGKLYVSYTTNQERWAIGPHDHILRWEYTVSRK 
NPHQVDLRTARVFLEVAELHRKHLGGQLLFGPDGFLYI I LGDGM 
ITLDDMEEMDGLSDFTGSVLRLDVDTDMCNVPYSIPRSNPHFNS 
TNQ P PE VFAHGLHDPGR CAVDRHPTDININLT I LCSDSKGKNRS 








S AR I LQ 1 1 KGKD YES EPS LLE FKPFSNGP L VGG FVYRG CQS E RL 
YGSYVFGDRNGNFLTLQQSPVTKQWQEKPLCLGTSGSCRGYFSG 
HILG FGEDE LGEVYI LS S S KSMTQTHNG KLYKI VDPKRPLMP EE 
CRAT VQ PAQTLTS ECS RLCRNG YCT ?TG KCCCS PG WEGD FCRTG 


5600 


1977 


1244 


SLRVliSGHLMQTRDLVQPDKPAS PKF I VTLDGVPS PPGYMSDQE 
EDMCFEGMKPVNQTAASNKGLRGLLHPQQLHLLSRQLEDPNGSF 
SNAEMSELSVAQKPEKLLERCKYWPACKNGDECAYHHPISPCKA 
FPNCKFAEKCLFVHPNCKYDAXCTKPDCPFTHVSRRIPVLSPKP 
AVAPPAPPSSSQLCRYFPACKKMECPFYHPKHCRFNTQCTRPDC 
TFYHPT I NVP PRHALKWIRPQTSE 


5601 


1977 


1244 


SLRVLSGHLMQTRDLVQPDKPASPKFIVTLDGVPSPPGYMSDQE 
EDM CFEGMKPVNQTAASNKGLRGLLHPQQLHLLSRQ LED PUGS F 
SNAEMSELSVAQKPEKLLERCKYWPACKNGDECAYHHPISPCKA 
FPNCKFAEKCLFVHPNCKYDAKCTKPDCPFTHVSRRIPVLSPKP 
AVAPPAPPSSSQLCRYFPACKKMECPFYHPKHCRFNTQCTRPDC 
T F YHPT I NVPPRHALKW I R PQTSE 


5602 


246 


766 


YHTSCTVWRTAKEALENTEVPVGCLMVYNNEWGKGRNEVNQTK 
NATRHAEMVAIDQVLDWCRQSGKSPSEVFEHTVLYVTVEPCIMC 
AAALRLMKIPLWYGCQNERFGGCGSVLNIASADLPNTGRPFQC 
I PG YRAEEAVEMLKTFYKQENPNAPKS KVRKXECQQ ILNMF 


5603 


1 


565 


FRGRT P I SGGERGCAQY P I PATPARSGENRTMPGAGDGGKAPAR 
WLGTGLLGLFLLPVTLSLEVSVGKATDIYAVNGTEILLPCTFSS 
CFGFBDLHFRWTYNSSDAFK I LI EGTVKNEKSDP KVTLKDDDR I 
TLVG S TKEKRNN I S I VLRDLE FS DTG KY TCHVKNPKENNLQHHA 
TIFLQWDRRMQ 


5604 


1 


1506 


EDIFPAQLLKLQRHERVWQQEPPVRDHRSWGGSGAGGVAGREWT 
DQGQVALGGHYMAEGEGYFAMSEDELACSPYIPLGGDFGGGDFG 
GGDFGGGDFGGGDFGGGGS FGGHCLDYCESPTAHCNVLNWEQVQ 
RLDGILSETIPIHGRGNFPTLELQPSLIVKWRRRLAEKR1GVR 
DVRLNGSAASHVLHQDSGLGYKDLDLIFCADLRGEGEFQTVKDV 
VLDCLLDFLPEGVNKEKITPLTLKEAYVQKMVKVCNDSDRWSLI 
S LSNNSG KNVELKF VDS LRRQFE FSVDS FQI KLDS LLL F YECS E 
NPMTETFHPTIIGESVYGDFQEAFDHLCNKIIATRNPEEIRGGG 
LLKYCNLLVRGFR PASDE I KTLQRYMCSRFFI DFSDIGEQQRKL 
ESYLQNHFVGLEDRKYEYLMTLHGVVNESTVCLMGHERRQTLNL 
I TMLA I R VLADQNV I PNVANVTC Y YQ P AP YVADANFSN YYI AQ V 
QPVFTCQQQTYSTWLPCN 


5605 


35 


1821 


SQRSCPRSPSSPAPPWARCSNPDSRTGGVPVPRAWSAGGPALGL 
MAAP VRLG RKR PL PACPNPLF VR WLTE WRDEAT RS RHRTR FVFQ 
KALRSLRRYPLPLRSGKEAKILQHFGDGLCRMLDERLQRHRTSG 
GDHAPDSPSGENSPAPQGRLAEVQDSSMPVPAQPKAGGSGSYWP 
ARHSGARVI LLVLYREHLNPNGHHFLTKE ELLQRCAQKS PRVAP 
GS ARP WP ALRS LLIIRNLVLRTHQ PARYS LTPEGLELAQ KLAESE 
G LS L LNVG I G P KE P PG EETAVPGAAS AELASEAG VQQQ PLELRP 
GBYRVLLCVDIGETRGGGHRPEIJjREI^RLHVTHTVRKIiHVGDF 
WA/AQETNPRDPANPGELVLDHIVERKRLDDLCSS1IDGRFREQ 
KFRLKRCGLERRVYLVEEHGSVHNLSLPESTLLQAVTNTQVIDG 
F FVKRTAD I KE S AAY LALLTRGLQ R L YQGHTLRSRPWGTPGNPE 
SGAT^TSPNPLCSLLTFSDFNAGAIKNKAQSVREVFARQLMQVRG 
VSGEKAAALVDRYSTPASLLAAYDACATPXEQETLLSTIKCGRL 
QRNLG PAL S RTLSQL YCS YGPLT 


5606 


3 


1099 


GRSRCPGPGARGGTMSPRSCLRSLRLLVFAVFSAAASNWLYLAK 
LSSVG3ISEEETCEKLKGLIQRQVQMCKRNLEVMDSVRRGAQLA 
IBECQYQFRNRRMNCSTLDSLPVFGKWTQGTREAAFVYAISSA 
GVAFAVTRACS S G ELE KCGCDRT VHG VS PQG FQWSG CS DNI AYG 
VAFSQS FVDVRERSKGASS SRALMNLHNNEAGRKAI LTHMRVEC 
KCHGVSGSCEVKTCWRAVPPFRQVGHALKEKFDGATEVBPRRVG 
SSRALVPRNAQFKPHTDEDLVYLEPSPDFCEQDMRSGVLGTRGR 
TCNKTSKAIDGCELLCCGRGFHTAQVELAERCSCKFHWCCFVKC 
RQCQRLVELHTCR 


" 5607 


521 


141 


PPVCNPAE^PSPGTVCSI^LLCWr.WLDIAMAGSSFLSPEhQR^r 
QQRKESKKPPAKLQPRALAGWLRPEDGGQAEGAEDELEVRFNAP 
FDVGI KLS GVQ YQQHSQALG KFLQD I LW E EAKEAPADK 


5608 


2 




WFQSPLRQADPGPPRHTLFMDFVAGAIGGVCGDAVGYPLDTVKV 
RIQTEPKYTGIWHCVRDTYHRERVWGFYRGLLLPVCTVSIiVSSE 
VFGTYRHCLAHICRLRFGNPDAKPTKADITLSGCASGLVRVFLT 
SPTEVAKVRLQTQTQAQKQQRRLSASGPLAVPPMCPVPPACPBP 
KYRGPLHCLATVAREEGLCGLYKGSSALVLRDGHSFATYFLSYA 
VLCE WLS PAGHS RP D VPGVLVAGG CAG VLAWAVATPMDV I KSRL 

QADGQGQRRYRGLLHCMVTIVREEGPRVLFKGLVLNCCRAFPVN 
MWF VAYE AVLR LARG LLT 


5609 


1628 


304 


AKGVWVLPSPPPRPGRGALVSGSGLRRGRSGTSWRPRRMNHKSK 
KRIREAKRSARPELKDSLDWTRHNYYESFSLSPAAVADNVERAD 
ALQLSVEEFVERYERPYKPWLLNAQEGWSAQEKWTLERLKRKY 
RNQKFKCGEDNDGYSVKMKMKYYIEYMESTRDDSPLYIFDSSYG 
EHPKRRKL LED YKVP KF FTDDLFQ YAGE KRR PP YR WFVMG PPRS 
GTGxHIDPLGTSAWNALVQGHKRWCLPPTSTPRELIKVTRDEGG 
NQCDEAITWFNVIYPRTQLPTWPPEFKPLEILQKPGETVFVPGG 
WWHWbNIiDTTIAITQNFASSTNFPVVWHKTVRGRPKLSRKWYR 
ILKQEHPELAVLADSVDLQESTGIASDSSSDSSSSSSSSSSDSD 

SECESGSEGDGTVHRRKKRRrCSMVGNGDTTSQDDCVSKERSSS 
R 


5610 


54 


1196 


LERTPASADMAWTKYQLFl^GI^VT^SihJTLSAKWAJDNFMAEG 
CGGSKEHSFQHPFLQAVGMFLGEFSCLAAFYLLRCRAAGQSDSS 
VDPQQP FNPLL FLPPALCDMTGTSLMYVALNMTSASS FQMLRGA 
VI I FTGLFS VAFLGRRLVLSQWLGI LATI AGL WVGLADLLS KH 
DSQHIOiSEVITGDLLIIMAQIIVAIQMVLEEKFVYKHNVHPLRA 
VGTEGLFGFVILSLLLVPMYYIPAGSFSGNPRGTLEDALDAFCQ 
VGQQPLI AVALLGNISS I AFFNFAGI S VTKELSATTRMVLDSLR 
TWI WALSLALGWEAFHALQI LGFLILLIGTALYNGLHRPLLGR 
LSRGRPLAEES EQERLLGGTRTPINDAS 


5611 


2 




FVLPNRLGIPGSTFRGPGACASSSSLAASAKPGAGGSPALAMSG 
ELSNRFQGGKAFGLLKARQERRLAEINREFLCDQKYSDEENLPE 
KLTAFKEKYME FDLNNEGE I DLMSLKRMMEKLGVPKTHLBMKKM 
ISEVTGGVSDTISYRDFVNMMLGKRSAVLFCLVMMFEGKANESSP 
KPVGPPPERDIASLP 


5612 


1 


721 


ASRDGYMDATIAPHRIPPEMPQYGEENHIFELM^AMWLCKHLNS 
S LLTL EWL I LNE FS YTATEARRLYLQRKT VPS ALLVQLI QERLA 
EEDCIKQGWILDGIPETREQALRIQTLGITPRHVIVLSAPDTVL 
I BRNLGKRI DPQTGE I YHTT FD WP PE S E I QNRLM VPEDI S ELET 

AQKLLEYHRNIVRVIPSYPKILKVISADQPCVDVFYQALTYVQS 
NHRTNAPFTPRVLLLGPVGS 


5613 


115 


1279 


RG VD PALR RAE KMLPLS I KDDE YKP P KFNL FGKI SG WFRS I LSD 
KTSRKLFFFLCLNLSFAFVELLYGIWSNCLGLISDSFHMFPDST 
AILAGLAASVISKWRDNDAFSYGYVRAEVLAGFVNGLFLIPTAF 
F I FSEGVERALAPPDVHHERLLLVSILGFWNLIG I FVFKHGGH 
GHSHGSGHGHSHSLFNGALDQAHGHVDHCHSHEVKHGAAHSHDH 
AHGHGHFHSHDGPSLKETTGPSRQILQGVFLHILADTLGSIGVI 
ASAIKMQNFGLMIADPICSILIAILIWSVI PLLRESVGILMQR 
TPPLLBNSLPQCYQRVQQLQGVYSLQEQHFWTLCSDVYVGTLKL 
I VAPDADARWILSQTHNI FTQAG VRQLYVQ IDFAAM 


5614 


3 


1268 


LLSRNEHACPLQAGLGLTQRKPKAIRGREGRATNQGQGETQNER 

APWGARORLGVMAFT.flrtT.fiPP'PTD'Pr'DWivT DnvtuoiM tnnnnifrt 

EDN YVQATDKRKALEE TMA FTTQALAS VAYQ VGNLAGHTLRMLD 
LQGAALRQ VEARVS TLGQMVNMHMEKVARRE I GTLATVQRL P PG 
QKVI A PENL PPLTP YCRR PLNFGCLDD I GHG I KDLS TQLSRTGT 
LSRKSIKAPATPASATLGRPPRIPEPVHLPWPDGRLSAASSAS 
SLASAGSAEGVGGAPTPKGQAAPPAPPLPSSLDPPPPPAAVEVF 
QRPPTLEELSPPPPDEELPLPLDLPPPPPLDGDBLGLPPPPPGF 
GPDEPSWVPASYLEKVVTLYPYTSQKDNELSFSEGTVICVTRRY 
SDGWCEGVSS BGTG FP PGNYVEPSC 


5615 


9 


1558 


ALGRRRPGDPREMEAAATPAAAGAARREELDMDVMRPLINEQNF 
DGTSDEEHEQELLPVQKHYQLDDQEGISFVQTLMHLLKGNIGTG 
LLGLPLA I KNAG I VLG PISLVFIGI I S VHCMHI LVRCSHFLCLR 
FKKSTLGYSDTVSFAMEVSPWSCLQKQAAWGRSWDFFLVITQL 
GFCSVYIVFLAENVKQVHEGFLESKVFISNSTNSSNPCERRSVD 
LR I YMLCFLP FI I LLVFIRELKNLFVLS FLANVSMAVSLVT I YQ 
YWRNKPDPHNLPIVAGWKKYPLFFGTAVFAFEGIGWLPLENQ 
MKES KR F PQALNIGMG I VTTLYVTLATIX5 YMC FHDE I KGS I T LN 
LPQDVWLYQSVKILYSFGIFVTYSIQFYVPAEIIIPGITSKFHT 
KWKQICEFGIRSFLVSITCAGAILIPRLDIVISFVGAVSSSTLA 
LILPPLVEILTFSKEHYNIWMVLKNISIAFTGWGPLLGTYITV 
EE IIYPTP KWAGTPQS PFLNLNSTCLTSGLK 


5616 


1 


719 


DDFVRCGPQSAAMGASARLLRAVIMGAPGSGKGTVSSRITTHFE 
LKHLSSGDLLRDNMIiRGTEIGVLAKAF I DQGKL I PDDVMTRIiAL 
HELKNLTQYSWLIiDGFPRTLPQAEALDRAYQlDTVINLNVPFEV 
I KQRLTARW IHPASGRVYNI EFNPP KTVG 1DD LTGE PL I QREDD 
KPETVIKRLKAYEDQTKPVLEYYQ30CGVLETFSGTETNKIWPYV 
YAFLQTKVPQRSQKASVTP 


5617 


176 


765 


PWRGRGSRPRGAGAMAEEQVNRSAGLAPDCEASATAETTVS SVG 
T CEAAG KS PE PXDYDS TCVFCR I AGRQDPGTELLHCENEDL 1 CF 
KDI KPAATHH YIiWPKKHIGNCRTLRKDQVE LVENMVTVGKTIL 
ERimFTDFTlTVRMGFHNPPFCSISHLHLHVLAPVDQLGFLSKLV 
YRVNS Y WF I TADHL 1 E KLRT 


5618 


3 


1692 


VLNY'lMLKSENkLSGKEDLWEKLQYLWKSTLNLPEDLLRVPDES 
LFLNSGGDSLKSIRLLSEIEKLVGTSVPGLLEIIIjSSSIIiEIYN 

h i lqt wpded vtfrks catkr klsn inqe eas gtslhqka i mt 
ftchneinafwlsrgsqilslnstrfltklghcssacpsdsvs 
qtniqnlkglnspvligkskdpscvakvseegkpaigtqkmelh 
vrwrsdtgkcvdasplwiptfdksstrvyigshshrmkavdfy 

SGKVKW3QI LGDR I ESS ACV S KCGNF I WGCYNGL VYVLKS NSG 
EKYWMFTTEDAVKSSATMDPTTGLIYIGSHDQHAYALDIYRKXC 
VWKSKCGGTVFSSPCLNLI P HHL YFATLGGLLLAVN P ATGNV I W 
KHSCGKPLFSSPQCCSQYICIGCVDGNLLCFTHFGEQVWQFSTS 
GP I FSS PCTS PS EQ KI FFGSHDCFI YCCNMKGHLQWKFETTS R V 
YATPFAFHNYNGSNEMLLAAASTDGKVWILESQSGQIiQSVYELP 
GEVFSSPWLESMLIIGCRDNYVYCLDLLGGNQK 


5619 


2160 


1477 


DSPVLPTSGNVISTAQPAQPWSAVEAALRSU3SPPGAGRGCPCP 
AQSLHSHQLAAWDPLKPSLRSYPPHLLQHPQLRSLTASSGHLGR 
RSCPQPRPLEELLRAGSSTRPQPLTSSCCGMSCMYSFLGHCSVL 
LWGTKGRGSGS PS SPGCCLHPPAQHSQDLPLVHVDVGWQPPLGP 

TVGLRPGLLGERQRGALRAGDPQCQCPLPATVREDLGVPSPWAA 
ECSPPATP 


5620 


930 


182 


PLPPPTLAMFLTRSEYDRGVNTFSPEGRLFQVEYAlEAIKLGST 

AlGIQTSEGVCZiAVEKRITSPLMEPSSIEKIVEIDAHIGCAMSG 

LIADAKTLIDKARVETQNHWFTYNETMTVESVTQAVSNLALQFG 

EEDADPGAMSRPFGVALLFGGVDEKGPQLFHMDPSGTFVQCDAR 

AIGSAS EGAQSSLQEVYHKSMTIiKEAI KS SLI I LKQVMEEKLNA 

TNIELATVQPGQNFHMFTKEELEEVIKDI 


5621 


3 


819 


WEFVEYTATDANVKNESLSSVQQLGIKMTVRYGKFLSLLKDGA 
ENDLTWVLKHCERFLKQQQTSIKSSLLCLQGNYAGHDWFVSSLF 
MIMLGDKEKTFQFLHQFSRLLTSAFLWLPRLKISSYLPNDTVES 
GIHPVYFCSTHYIEMLLKAELPLVFSAFHMSGFAPSQICLQWIT 
QCFWNYLDWIEICHYIATCVFLGPDYQVYICIAVFKHLQQDILQ 

HTQTQDLQVFLKEEAliHGFRVSDYFEYMEILEQNYRTVLLRDMR 
NIRLQST 


5622 


1122 


456 


AASTKDAVSRKRSHSASEKSGTGTSISKRIjNMNPQIRNPMKAMY 
PGTF YFQFKNLWEANDRNETWLCFTVEGI KRRSWSWKTGVFRN 
Q\T)SETHCHAERCFLSWFCDDILSPNTKYQVTWYTSMSPCPDCA 
GEVAEFLARHSNVNLTIFTARLYYFQYPCYQEGLRSLSQEGVAV 








EIMDYEDFKYCWENPVYNDNEPFKPWKGLKTNFRLLKRRLRE^C 
Q 


5623 


3 


954 


FLPFFIRAPKIS RNGQWLFTFTTP FP FANKAL PGWEG I VP ACFW 
RKKILTPSTGTMELLQVTILFLLPSICSSNSTGVLEAANNSLW 
TTTKPSITTPNTESLQKNVVTPTTGTTPKGTITNEbLKMSLMST 
ATFLTSKDBGLKATTTDVRKNDS 1 1 SNVTVTSVTLPNAVSTLQS 
SKPKTETQSSIKTTEIPGSVLQPDASPSKTGTLTS1PVTIPENT 
SQSQVIGTEGGKNASTSATSRSYSS I ILPWIALIVITLSVFVL 

VGLYRMCWKADPGTPENGNDQPQSDKESVKLLTVKTISHESGEH 
SAQGKTKN 


5624 


159 


898 


PG VAAAAG AL PQYHG PAP AL VS C RRELS LS AGS LQLER KRRDFT 
SSGSRKLYFDTHALVCLLEDNGFATQQAJBI I vsalvki leanmd 
IVYKDMVTKMQQEITFQQVMSQIAKVKKDMXILEKSEFSALRAE 
NEKI KLELHQLKQQVMDE VI KVRTDTKLDFNLEKSRVKELYSLN 
EKKLLELRTEIVALHAQQDRALTQTDRKIETEVAGLKTMLESHK 
LDNI KYLAGS I FTCLTVALG FYRLW I 


5625 


1 


1180 


TIPSSAAAQRAGPPAG^EALSPGGARAHAERRGEMRATPLAAP 
AGS LS RXKRLE LDDNLDTE R P VQKRARS G PQPRL PP CLL PL S P P 
TAPDRATAVATASRLGPYVLLEPEEGGRAYQALHCPTGTBYTCR 
VYPVQEALAVLEPYARLPPHKHVARPTEVLAGTQLLYAFFTRTH 
GDMHSLVRSRHRIPEPEAAVLFRQMATALAHCHQHGLVLRDLKL 
CRFVFADRERKKLVLENLEDSCVLTGPDDSLWDKHACPAYVGPE 
ILSSRASYSGKAADVWSLGVALFTMLAGHYPFQDSEPVLLFGKI 
RRG AY ALP AG LS APARCL VRCLLRR E PAERLTATG I LLHPW LRQ 
DPMPLAPTRSHLWEAAQWPDGLGLDEAREEEGDREWLYG 


5626 


3123 


2011 


P PRALGS VAMENQVLTPH VYWAQRHRE LYLRVELSDVQNPAI SI 
TENVLH FKAQGHGAKGDNV YE FHLEFLD LVKPE P VYKLTQRQVN 
ITVQKKVSQWWBRLTKQEKRPLFLAPDFDRWUDESDAEMELRAK 
EEERLNKLRLES EGS PET LTNLRKGYLFM YNLVQFLGFSW I FVN 
LTVRFC ILGKES FYDTFHTVADMMYFCQMLAWETINAAIGVTT 
S PVLPS L I QLLGRNFI LF1 1 FGTMEEMQNKAWPFVF YLWSAI E 
IFRYSFYMLTCIDMDMKVLTWLRYTLWIPLYPLGCLAEAVSVIQ 
S I P 1FNETGRFS FTLPYPVKI KVRFSFFLQ IYLIMI FLGLY INF 
RHLYKQRRRRYGQKKKKIH 


5627 


3123 


2011 


PPRALGSVAMEKQVLTPHVYWAQRHRELYLRVELSDVQNPAISI - 
TENVLHFKAQGHGAKGDNVYEFHLEFLDLVKPEPVYKLTQRQVN 
I T VQKKVSQ WWERLTKQ EKR PL FLAPDFDR WLDE SDAEMELRAK 
EESRLNKLRLESEGSPETLTNLRKGYLFMYKLVQPLGFSWIFVN 
LTVRFC ILGKES FYDTFHTVADMMYFCQMLAWETINAAIGVTT 
SPVLPSLIQLLGRNFILFIIFGTMEEMQNKAWFFVFYLWSAIE 
I FRYSF YMLTCIDMDWKVLTWLRYTLWI PLYPLGCLAEAVSVIQ 

SIPIFNBTGRFSFTLPYPVKIKVRFSFFLQIYLIMIFLGLYINF 
RHLYKQRRRRYGQKKKKIH 


5628 


75 


1455 


VAGAMASKCLKAGFSSGSLKSPGGASGGSTRVSAMYSSSPCKLP 
S LS P VARS FSACS VGLGRS S YRATS CLPALCL PAGGFATS YSGG 
GG W FGEG I LTGNEKETMQSLNDR LAG YXEKVRQLEQENASLESR 
I RE WCEQQ VP YMC PD YQS YFRT I E ELQK KTLCS KAENARLWE I 
DNAKLAADDFRTKYBTEVSLRQLVESDINGLRRILDDLTLCKSD 
LEAQVESLKEELLCLKKNHEEEVNSLRCQLGDRLNVEVDAAPPV 
DLNRVLEEMRCQYETLVENNRRDAEDWLDTQSEELNQQVVSSSE 
QLQSCQAEI IELRRTVNALEIELQAQHSMRDALESTLAETEARY 
S SQLAQKQCM I TNVEAQLAEI RADLERQNQE YQVLLDVRARLE C * 

EINTYRGLLESEDSKLPCNPCAPDYSPSKSCLPCLPAASCGPSA 
ARTNCS ARPI CVPCPGGRF 


5629 


2287 


938


GRPRSSSDNRNFLRERAGLS SAAVQTR IGNSAAS RRS PAARP PV 
PAP PALPRGRPGTEGS TS LS APAVLWAVAVWVWSAVAWAMA 
NYIHVPPGSPEVPKLNVTVQDQEEHRCREGALSLLQHLRPHWDP 
QEVTWJLPTTOITNKLIGCYVGNTMEDVVLVRIYGNKTELLVDR 
DB E VKS FRVLQ AHGCA PQLYCT FNNGL CY E F I QGEALDPKHVCN 
PAIFRLIARQLAKIHAIHAHNGWIPKSNLWLKMGKYFSLIPTGF 
ADEDINKHfcXSDIPSSQlIliQEEMTWMKKlliSNLGSPVVLCHNDir -1 
LCKN 1 I YNE KQGDVQF I D YE YSG YNYLAYD IGNH PNE FAG VS DV 
DYSLYPDREIjQSQWLRAYLEAYKEFKGFGTEVTEKEVEILFIQV 
NQFALASHPFWGLWAL IQAKYST I E FDFLG YAI VRFNQ Y FKM KP 
BVTALKVPB 


5630 


1194 


278 


GFWAIAQTCAHHLPPGSPWIjVPASPWRLPEMSSFGYRTLTVALF 

tliccpgsdekvfevhvrpfcklavepkgslevncsttcnqpevg 

GLETSLDKILLDEQAQWKHYLVSNISHDTVLQCHFTCSGKQESM 
KSNVS VYQP PRQVILTLQ PTLVAVGKS FTI BCRVPTVBPLDSLT 
LFLFRGNETLHYETFGKAAPAPQEATATFNSTADREDGHRNFSC 
LAVIJ5LMSRGGNIFHKHSAPKMLEIYEPVSDSQMVIIVTVVSVL 
LSLFVTSVLLCFIFGQHLRQQRMGTYGVRAAWRRLPQAFRP 


5631


1053 


290 


SRVDDFVRPEPSRAEPSRSGRRRPARRAATMSVFGKLFGAGGGK 
AGKGGPTPQEAIQRLRDTEEMLSKKQEFLEKKIEQELTAAKKHG 
TKNKRAALQALKRKKRYEKQIAQIDGTLSTIEFQREALENANTN 
TEVLKNMGYAAKAMKAAHDNMDIDKVDELMQDIADQQELAEErs 
TA1SKPVGFGEEFDEDELMAELEELEQEELDKNLLEISGPETVP 
LPNVPS I ALPS KPAKKKEEEDDDMKELENWAGSM 


5632 


3


952 


WLGWSPPRRLWWGSLGAAQRPAVPVSGLARSLHVETRRPHRRA 
S VRVARGRIjG VWAQ P QPLLPR P VG S RREMQPPG P P PAYAPTNGD 
FTFVS SAOAEDLSGS IAS PDVKLNLGGDFI KESTATTFLRQRG Y 
GWLLEVEDDDPEDNKPLLEELDIDLKDIYYKIRCVLMPMPSLGF 
NRQWRDNPDFMGPLAWLFFSMISLYGQFRWSWIITIWIFGS 
LTIFLLARVLGGEVAYGQVLGVIGYSLLPLIVIAPVLLWGSFE 

WSTLIKLFGVFWAAYSAASLLVGEEFKTKKPLLIYPIFLLYIY 
FLSLYTGV 


5633 


771 


460 


QGCSKTMSVGRPFYRSSEFMEQLLSSHIjHQVPFFCCFTWCLCN 
CLFENSVSKLYMLCFNFFMSIFFYSLSITKLNLIYLWGLSYQSL 
LLLLLSGHRPWGSSMV 


5634 


1446 


855 


PRATGRIRSRAAASRPRAGAGASGAEPRSGRERSRLSGRRAPAM 
ARNTLSSRFRRVDIDEFDENKFVDEQEEAAAAAAEPGPDPSEVD 
GLLRQGDM LRAFHAALRNS P VNTKNQAVKERAQGWLKVLTNFK 
SSEIEQAVQSLDRNGVDLLMKYIYKGFEKPTENSSAVLLQWHEK 
ALAVGGLGSIIRVLTARKTV 


5635 
5636 


3 


943


DRGPRSTATDTGRARVSFWRFPLDPGVKN^NVQiSGEKRRFRTL*" 

RS LFHPFP VTRSGAPRAVLVGS S W PAXM VAPAVKVARG WSGLAL 

GVRRAVLQLPGLTQVRWSRYSPEFKDPLIDKEYYRKPVEELTEE 

EKYVRELKKI^LIKAAPAGKTSSVFEDPVISKFTNMMMIGGNKV 

LARSLMIQTLEAVKRKQFEKYHAASAEEQATIERNPYTIFHQAL 

KNCEPMIGLVPILKGGRFYQVPVPLPDRRRRFLAMKWMITECRD 

KKHQRTLMPEKLSHKLLEAFHNQGPVIKRKHDLHKMAEANRALA 

HYRWW 




2253 


1143 


LEDTICQHPPAEKKLYLYHRKLREVER^GIPRLPKDVFMDTHQG 
LTD VRAKVTG FSEG WDS VKGGFS S FSQ ATHSAAG AWS KPR E I 
ASLIRNKFGSADNI PNLKDSLEEGQVDDAGKALGVI SNFQSS PK 
YGS EEDCS SATSGSVGANSTTGGIAVGASS S KTNTLDMQSSGFD 
ALLHEIQEIRETQARLEESFBTLKEHYQRDYSLIMQTLQEERYR 
CERLEEQIiNDLTELHQNEILNLKQELASMEEKIAYQSYERARDI 
QEALEACQTRISKMELQQQQQQWQLEGLENATARNLLGKLINI 

LLAVMAVLLVFVSTVANCWPLMKTRNRTFSTLFLWFIAFLWK 
HWDALFSYVERFFSSPR 


5637 


946 


2532 


wsfcgaranaxmmaaynggtsaaaaghhhhhhhhlphlppphl"h 

HHHH PQHH LHPG S AAAVH PVftfiHTS <5 A aa AAAAAaahftaMT m-ar* 
QQQPYFPS PAPGQAPG PAAAAPAQVQAAAAATVKAHHHQHSHH P 
GQQLDI EPDRP IGYGAPGWWSVTDPRDGKRVALKKMPNVFQNL 
VSCKRVFRELKMLCFFKHDNVLSALDI LQPPHI DYFEB 2 YWTE 
LMQSDLHKIIVSPQPLSSDHVKVFLYQILRGLKYLHSAGILHRD 
I KPGNLLVNSNCVLKI CD FGLARVEELDESRHMTQEWTQYYRA 
PEILMGSRHVSNAIDIWSVGCIFAELLGRRILFQAQSPIQQLDL 
I TDLLGTPSLEAMRTACEGAKAH I LRGPHKQP S L P VLYTLS S QA 
THEAVHLLCRMLVFDPyKRISAKDALAHPYLDEGRLRYHTCMCK " 
CCFSTSTGR VYTSDFE PVTNP KFDDT FE KNL9 S VRQ VKE I IHQF 
ILEQQKGNRVPLCINPQSAAFKSFISSTVAQPSEMPPSPLVWE 


5638


125 


1155 


DRKMSELDQLRQEAEQLXNQIRDARKACAJDATLSQITNNIDPVG 
RIQMRTRRTLRGHLAKIYAMHWGTDSRLLVSASQDGKLIIWDSY 
TTNKVHAI PLRSS WVMTCAYAPSGNYVACGGLDNI CS I YNLKTR 
EGNVRVSRBLAGHTGYLSCCRPLDDNQIVTSSGDTTCALWDIET 
GQQ7TTFTGHTGDVMSLSLAPDTRLFVSGACDASAKLWDVREGM 
CRQTFTGHESDIKAICFFPNGNAFATGSDDATCRLFDLRADQEL 
KT Y5HDNI I CG I TS VSFS KSGRLLLAG YDDFNCNVWDALKADRA 
GVLAGHDNRVSCLGVTDDGMAVATGSWDSFLKIWN 


5639 


125


1155 


DRKMSELDQLRQEAEQLKN^iRDARKACADATLSQiimiDPVG 

riqmrtrrtlrghlakiyamhwgtdsrllvsasqdgkliiwdsy 
ttnkvhaiplrsswvmtcayapsgnyvacx;gldnicsiynlktr 
egn vr vs relaghtgyls c crflddnq i vts sgdttcalwd i et 
gqqtttftghtgd vms ls lapdtrlf vsgacd asaklwd vr egm 
crqtftghesdinaicffpngnafatgsddatcrlfdlradqel 

MTYSHDNI ICG 1TSVSFSKSGRLLLAG YDDFNCNVWDALKADRA 
GVLAGHDNRVS CLG VTDDGMAVATGS W DSFLKIWN 


5640 


280


1092 


O^NKKTMLSHNTMMKQRKQQATAIMKEVHGNDVDGMDLGKKVS 
IPRDZMLEELSHLSNRGARLFKMRQRRSDKYTFENFQYQSRAQI 
NHS I AMQNGKVDGSNLEGGSQQAPLTP PNTPDPRS PPNPDNIAP 
GYSGPLKEIPPEKFNTTAVPKYYQSPWEQAISNDPELLBALYPK 
LFKPEGKAELPDYRSFNRVATPFGGFEKASRMVKFKVPDFELLL 

LTDPRFMSFVNPLSGRRSFNRTPKGWISENIPIVITTEPTDDTT 
VPESEDL 


5641


27 


332 


CRHNCNG DVKLLS NQ to KLFAFHLFTFHGLLH FLDG S IQKL I QA 
EIILSDNSSILVLENNFLFKVKSKQFIHLIAKKFYISITIVSAS 
NGBSFVLSMIVTG 


5642 


199 


1247 


ITPCRMDFLVLFLFYLAS VLMGLVL I CVCSKTHSLKGLARGGAQ 
IFSCIIPECLQRAMHGLLHYLFHTRNHTFIVLHLVLQGMVYTEY 
TWEVFGYCQELELSLHYLLLPYLLLGVNIiFFFTLTCGTNPGIIT 
KANE LLFLHVYE FDEVMFP KN VRCSTCD LRKPARS KHCS VCNWC 
VHR FDHHCVWVNNCIGAWN IRYFLI YVLTLTASAATVAI VSTTF 
LVHL WMS DLYQE T YI DDLGHLHVMDTVFLIQ YL FLTFPUI VFM 
LGFVVVLSPLIXK5YLLFVLYLAATNQTTNEWYRGDWAWCQRCPL 
VAWPPSAEPQVHRNIHSHGLRSNLQEIFLPAFPCHERKKQE 


5643 


1 


847 


PSGG\/RDVETRGPGSRAARGPRVVMERRGVGAGAIAKKKIiAEAK "" ' 
YKERGT^iAEDQLAQMSKQLDMFKTNLEEFASKHKQEIRKNPEF 
RVQFQDMCATIGVDPLASGKGFWSEMIiGVGDFYYELGVQIIEVC 
LALKHRNGGLITLEELHQQVLKGRGKFAQDVSQDDL I RAI KKLK 
ALGTGFGI I PVGGTYLIQSVPAELNMDHTWljQLAEKNGYVTVS 
EIKASLKWETEliARQVLEHLLKEGIiAWlJDIjQAPGEAHYWLPALF 
TDLYSQE ITAEEAREALP 


5644 


83


1138 


PRRMGSWVQLITSVGVQQNHPGWTVAGQFQEKKRFTEEVIEYFQ 
KKVSPVHLKILLTSDEAWKRFVRVAELPREEADALYEALKNLTP 
YVAI ED KDMQQKEQQ FREW FLK EFPQ I RW K I QES I ERLRVXANE 
IEKVHRGCVIANWSGSTGILSVIGVMLAPFTAGLSLSITAAGV 
GLGIASATAGIASSIVENTYTRSAELTASRLTATSTDQLEALRD 
I LHD I T PNVL S FALD FDEATKM I AND VHTLR RS KATVGR PL I AW 
R YVP I NWET LRTRGAPTR I VRKVARNLG KATSG VLWLDWNL 
VQDSLDLHKGEKSES AELLRQWAQELEENLNELTH I HQSLKAG 


5645 


537 




vqsvrdlkrlsptdppgdsgnrdvtredpvtgpLnsassqvptl 

YLCU3NSLLGHSSVEDARATMELYQISQRIRARRGLPRLAVSD 


5646 


3745 


3328 


AEQYGTSPHLLPTMLLSSCLPPANVTTKAATPPPLVLSLTTADP 
AGKPAPCRVTLTLLRAS I PATKRAS FLS S F I KMFFEELE YILGF 
LSLLKFHVHVSVYSAICHFQKEGTGNSRSFTCTPELFPRLQTHL 
RAEGGAQ 


5647 


288 


800 


G VI MATS E LS CEVS E ENC ER REA F W AE W KDLTLSTR P E EGCS LH 
EEDTQRHETYHQQGQCQVLVQRSPWLMMRMGILGRGLQEYQLPY 
QRVLPLPlFrPAKMGATKEBREDTPIQLQELLALETALGGQCVD 
RQEVAEITKQLPPWPVSKPGALRRSLSRSMSQEAQRG 


5648
5649


7 


151$ 


VLS3LCGRHEALREVGAEWPPPTCSPKICSGLQQAGNTDWSLTM 
APQSLPSSRMAPLGMLLGLLMAACFTFCLSHQNLKEPALTNPEK 
SSTKETERKETKAEEELDAEVLEVFHPTHEWQALQPGQAVPAGS 
H^LNI^TGEREAKLQYEDKFRWNLKGKRLDINTNTYTSQDLKS 
ALAKFKEGAEMESSKEDKARQAEVKRLFRPIEELKKDFDELNW 
IETDMQIMVRLINKFNSSSSSLEEKIAALFDLBYYVHQMDNAQD 
LLS FGGLQ WINGLNSTE PLVKEYAAFVLGAAFSSNP KVQVEAI 
EGGALQKLL VI LATEQ PLTAKKKVLFALCSLLRHFPYAQRQFLK 
LGGLQVLRTLVQEKGTEVLAVRWTLLYDLVTEKMFAEEEAELT 
QEMS PE KLQQ YRQ VHLLPGLWEQG WCE I T VHLLALPEHDAR EKV 
LQTLG VLLTTCR DRYRQD PQLG RTLASLQAEYQ VLAS LELQDGE 
DEGY FQ ELLGS VNS LLKELR 




1172 


3006 


klqeqldaineeirmiqeekestelraeeietrvtsgsmeai.nl 

KQLRKRGS I PTSLTDLSLASAS PPLSGRSTPKLTSRSAAQDLDR 
MGVMTLPSDLRKHRRKIjLSPVSREENREDKATIKCETSPPSSPR 
TLRLEKLGHPALSQEEGKSALEDQGSNPSSSNSSQDSLHKGAKR 
KGIKSSIGRLFGKKEKGRLIQLSRDGATGHVLLTDSEFSMQEPM 

vpaklgtqaekdrrlkkkhqlledarrkgmpfaqwdgptwswl 
elwvgmpawyvaacranvksgaimsalsdteiqreigisnalhr 
lxlrlaiqemvsltspsapptsrtssgnvwvtheemetletstk 
tdseegswaqtlaygdmnhewignewlpslglpqyrsyfmeclv 

DARMLDHLTKKDLRVHLKMVDSFHRTSLQYGIMCLKRIiNYDRKE 
LEKRR EES QHE I KDVLVWTNDQWHWVQS iglrdyagnlhesgv 
hsallaldenfdhntlali lq I ptqntqarqvmere fnnllalg 
tdrklddgddkvfrrapswrkrfrprehhgrggmlsasaetlpa 
gfrvstlgtlqpppappkkimpeahshylyghmlsafrd 


5650 


1172 


3006 


mlqeqldaineeirmiqeekestelraeeietrvtsgsmealnl 
kqlrkrgsiptsltdlslasaspplsgrstpkltsrsaaqdldr 
mgvmtlpsdlrkhrrkllspvsreenredkatikcetsppsspr 
tlrleklghpalsqeegksaledqgsnpsssnssqdslhkgakr 
kgikssigrlfgkkekgrliqlsrdgatghvlltdsefsmqepm 
vpaklgtqaekdrrlkkkhqlledarrkgmpfaqwdgptwswl 
elwvgmpawyvaacranvksgaimsalsdteiqreigisnalhr 
lklrlaiqemvsltspsapptsrtssgnvwvtheemetletstk 
tdseecswaqtiaygdmnhewignewlpslglpqyrsyfmeclv 
darmldhltkkdlrvhlkmvds fhrts lqygi mclkrlnydrke 
lekrreesqheikdvlvwtndqwhwvqsiglrdyagnlhesgv 
hgalij^denfdhntlalilqiptqntqarqvmerefnnllalg 
tdrklddgddkvfrrapswrkrfrprehhgrggmlsasaetlpa 
gfrvstlgtlqpppappkkimpeahshylyghmlsafrd 


5651 
5652


646 


1869 


arqgqrqpwg * earakgpases prv* egsgwegpas p * tpgstl 
awgegagir*asgltaagaasaaaa/ppptrggpapagcgrapp 
wpaplrvpthgrapaprsraaprapalshgtaaaalspaspagp 
adp*lpghssqspprg*rwgrsrsapapahpehpapagsasasq 
qtpgwpgscclaqgwqaeplgapgaedg\pvppqrgfplgtlgs 
pagswaglagyg*agapgtqatapraagqtpvaaapncrv+gsa 

PALHRAPAAAD PGSPLQ AP PRAWAS P AAAG PGLS SS D YCGGLGA 

gwragispellgaaglsdnwarcpgpgpab*ggqpgcrtipasa 
cmpsp?vegslglsrkghgdlpsqar*gwhbcrrarhlvplprl 
lgprgrtgrpssps 


5653


735 


343 


HHKKYQHIHQKS FSCPEP AfflK Q Pmrv iniTit'truM vt tic rvr-rtrwy — 

* ~ ^ 4 »^. Ai ^ ivo fov,irDr i\\^\jz\C> t: lie AAJlljflKHmK 1 iHS "TRiJx I 

cefcarsfrtssnlvihrrihtgekplqceicgftcrokaslnw 
hqrkhaetvaalrfpcefcgkrfekpdsvaahrskshpallla 




66 


1401 


rgrlqsrgrltlglvlllldilgarqh6qrVshgwkggfltapl 
ctpqpcqpgtrrgrrrslkbatepqlamaeefvtlkdvgmdftl 

2DWEQLGLEQGDTFWDTALDNCQDLFLLDPPRPNLTSHPDGSED 

lbpmggspeatspdvtetknsplmedffeegfsqei/srdvtq 
3wllelqfrrslyrghlvr*farrsrkssev*ychqrgkshgmq 
ES * J.KERTQSCVHRFHGRRFHG\DNVSEKTLTPAKSKEYRGEFT"" 
SYSDHSQQDSvgEGEKPYQCSECGKSFSGSYRLTQHWITHTREK 
P TVHQECEQG FDR KASHSG YPKTHTG YKFYVCWF YHT P F q n<?TV 

LWHQKTHAGEKPCKSQDSDHPPSHDTQSGEHQKTHTDSKSYNCN 

ECGKAFTRIFHLTRHQKXHTRKRYECSKCQATFNLRKHLIQHQK 
THAANV 


5654


3 


598 


TL PL FPGRRFRGWRRCGAVAAR KNS TGGNVS I NQRR DS VRMS AL 
NWKPFVYGGLAS 1TAECGTFPIDLTKTRFQIQGQTNDAKFKEI I 
YRGMLHAL VR I GREEGL KALYS G * VGLHAPLCHC S LFKMG I DFR 

PRLHRSQ VKS LR CV * KEQ I A* * /M FSLLI STL I S KY I YY AADVL 
EKL FY Y IQVQTDNNKK I CLFKN I 


5655


2 


867 


RPPGIRAPRQLHPAAGRRPDASARPRFRPTVLLHDPFQLSFPPP 
PLSYPSVFPAVARVLPQRSGDYRAAGMPQLSGGGGGGGGDPELC 
ATDEMIPFKDEGDPQ\REKIFAEIVNPEEEGDLADIKSSLVNES 
EI I PASNGHE VARQAQTSQEPYHDKAREHPDDGKHPDGGLYNKG 
PS YS S YSG YI MM PNMNND P YMSNGS LS P P I PRTSNKVP WQ P SH 
AVHP LT PL I T YS DEH P <? Pfi Q HP <?H T DC n\/M q irnr* u c oil tin M>r»T 
PTFYPLSPGGGGQITPPLGWQGQP 


5656 


228 


1066 


PRR VP PLPE FASG PGAAF FHSGRLQRS LTKDS AGC FSQCRS RAM 
LVLRS GLTKALASRTLAPQ VCSS FATGPRQ YDGTFYEFRTTY"LK 
PSNMNAFMENLKKNIHLRTSYSELVGFWSVEFGGRTNKVFHIWK 
YDNFPKRAE VR KALANCKEWQEQS 1 1 P>JLAR I DKQETE I T YL I P 
WS KLQ KP P KEG V YELAVFQMKPGG PALWGDAFE RAINAHVNLG Y 
TKWG VFHTE YG ELNR VH VLWWNE S ADS RAAVRHKSHEDP I S WG 
GVRES VNYL \ VSQQNM 


5657


105 


1052 


GORLQSPRVQMPVQPPSKDTEEMEAEGDSAAEMKGEEEESEEER 
SGSQTESEEESSEMDDEDYERRRSECVSEMLDLEKQFSELKEXL 
FRERLSQLRLRLEEVGAERAPEYTEPLGGLQRSLKIRIQVAGIY 
KGFCLDVIRNKYECELQGAKQHLESEKLLLYDTLQGELQBRIQR 
LEEDRQSLDLSSEWWDDKLHARGSSRSWDSLPPSKRKKAPLVSG 
PYIVYMLQEIDILEDWTAIKKARAAVSPQKRKSD\DLDPAVHSQ 
GDPQSSWHCTQDSRLPPADRRTHRPLRVCPARLLWCCWALPLHL 
ALVWTPPL 


5658 


2346 


3541 


TERRVYNPWPEPDPD\CIQEDPWNLPNSIKTLVDNIQRYVEDGK ' ' 

R YNNNGE YEES SRDASRKWLEQVAATG VLLHCQSLLS PATVKEE 
RTMLEDIWVTLSELDLWTFSFKQLDENYVANTNVFYHIEGSRQA 
LKVIFYLDSYHFSKLPSRLEGGASLRLHTALFTKVLENVEGLPS 
PGSQAAEDLQQDIKAQSLEKVQQYYRKLRAFYLERSNLPTDAST 
TAVK1 DQL I RP INALDELCRLMKSFVH P KPGAAGS VGAGL I P I S 
SELCYRLGACQM VMCGTGMQRSTLS VSLEQAA I LARSHGLLPKC 

IMQATDIMRKQGPRVEILAKNLRVKDQMPQGAPRLYRI/^PKMN 
GDL 


5659


2 


696 


WKRSGE VS P KGELG AWRGNS G R P KI IG RAAEAENEDRTLGRLLP 
GNERSQPRSPLRLLAPQLKAEAAADKGLAPVPPPFSSGHSGPC\ 
EREGEGQRGRGRS RRG AH LBL KPS PGLRAGAPTDRGRGG PAE VA 
AAGGRRMVQKES QATLEBRESELSSNPAASAGASLE PPAAPAPG 
EDNP AGAGG \AAVAGAAGGARRFLCG WEG FYGRP WVMEQRKEL 
FRRLQKWELNTYL 


5660 


229 


853 


PVTMWAFSELPMPLLINLTVSIJ^FVATVTT.TDaT7DriwpTA^V>"T 

CGQDLNKTSRQQIPESQSVISGAVFLIILFCFIPFPFLNCFVKE 
QRKAFPHHE FVALIGALLAI CCMI FLGFADDVLNLRWRHKLLLP 
TAASLPLLMVYFXNFGNTTIVVPKPFRPILGLHLDIiGR*SYHCC 
PYGTYFREPFLVLHILLQVPLFCLCVFPDPFW 


5661 


2 


473 


LNLYPSPCGGI PKLPGLPREAAAALGAS FLAEAPLP VTVRGSGL 
AGMAVTCDPKAFLS ICFVTLVFLQLPLAS I CQN*GTDSCAS RGK 
AD FDVTG PHAP I IiAMAGG HVE LQ CQL FPNI S AEDMELRWYRCQ P 
SLAVHMHERGMDMDGEQKWQYRGRT 


5662 


2 


1318 


LRKEGRCRRGSNRGVWAAPAEGLGGRGMLGVRCLLRSVRFCSSA 
PFPKHKPSAKLSVRDALGAQNASGERIKIQGWIRSVRSQKEVLF 
LHVNDGSSIiESLQWADSGLDSRELTFGSSVEVQGQLIKiJPSKR 
QNVELKAEKI KV IGNCDAKDFP I KYKERHPLEVTLRQY PHFRCRT 
NVI/3S I LR I RS EATAA IHS FFKDSGFVHI HTP 1 1 TSNDS EGAGB 

LFQLEPSGKLKVPEENFFNVPAFLTVSGQLHLEVMSGAFTQVFT 
FGPTFRAENSOSRRHLAEF YMIKAE T <3 Fvn Q TnviT jwwt pitt p v 

ATTMri VLS KCPEDVELCHK FI APGQ KDRL * HMLKNNFLI I S YTE 
AVE ILKQASQNFTFTPEWGADIjRTEHEKYLVKHCGNI PVFVTNY 
PLTLKPFYMRDNEDGPQELEGSVA*HSLGLMILLSIWIGQP 


5663 


119 


698 


PADIGRSTAKTPGPPRSLEMDDPRYnMr'PT.yrtagr'rDfzvpDgT V — 
VQS YFE KG PLTFRD VA I E P S LEE WQCLDS AQQG LYRKVMLENYR 
NLVFLGIALTKPDLITCLEQGKEPHKIKRHEMVAKPPVICSirFP 
QDLWAEODI KDS FOEAI L JCKVmcVrtW & MPm.n vra nw <z\r n c n ifrru 
KEHDNKLNQCLI PKKKK 


5664 


118 


572 


SLSMESNHKSGDGLSGTQKEAAIiRAliVQRTGYSLVQENGQRKYG ' 
CPPPGWDAAPPERRCRT PTrztf r.DPnT.ppriPT tdt npvTPVTVDM 

RMMMD FNGNNRG YAFVT FSNKVE AKNAI KQLNNTY E I RNGRLLG V 
CASVDNCRLFVGGIPKTKK 


5665


347 


702 


WQHLI ILLHCERTSPAMITS ELPVLQDSTNETTAHSDAGSELE 
ETE VKGXRKRGRPGRPPSTNKKPRKS PGEKSR I EAGIRGAGRGR 
ANGHPQQNGEGEPVTLFEWKLGKSAMQRC 


5666 


213 


540 


va^ujr iou\Hl I liNWU-UOPVPrNSSiiPDE YKI AALVFYS CI FI I 

GLFVNITALWVFSCTT1QCRTTVT1YMMJJVALVDLIFIMTLPFRM 
FY YAKDEW PFC3 P YPffl T T r* A 


5667


1 


695 


HPLPSASLGLPSVSLGVSLCVRSALLEAWPfJLPKRRRARVGSP 
SGDAASSTPPSTRFPGVAIYLVEPRMGRSRRAFLTGLARSKGFR 
VLDACSSEATHWME ETS AE EAVS WQERRMAAA P PGCTP PALLD 
ijniji wnoyr vr vc V—KHKIjei VAuro J\aj ruo r AWMFAXACQR 
PT PLTHHNTGLS BALE I LAEAAGFEGS EGRLLT FCRAAS VLKAL 
PSPVTTLSQLQ 


5668


691


894 


*»*ruijrjjvr "i-wKJ\.15&JJ.MVxjV^v*KWoIrfaxJX^ 
VLVRTAI RCAQAQTG I DLSGCTK W 


5669 


407 


1 


xjrao x vj&vjJuonil/irJiry/4.l 1 irr 1 x ijJiAKJvRKGBXGD 
ADSRFNDRYAHKSAQLYFLYFVCWIFQDVYYFTIKEKNHFFFPK 
ARGAPTKYSGSPIGSPTTTPPTRPPSFNLHPAPHLLASMQLQKL 
NSQ 


5670 


3 


373 


SSECLTMAWlPI,Ll,PLLILCTVSVASVEtAQPSSVSVSPGQ¥AK 
ITCSGDVLAKKYARWFOfiTf Pfirift.PVT.VT vtrrvrPD tj cr» t tjpd pe<*« 

STSGTTVTLTISGAQVEDEADYFCYSATDNFLWVF 


5671 


280 


524 


KFPPKKTPPHLGMESAITLWQFLLQLLLDQKHEHLICWTSNDGE 
FKLLKAKKVAKLWGIiR KNKTNMNYD KLtS RALRLL FMT 


5672 


2 


557 


FVPATPDPGVWIiPPSRDPAMAKRSSIjYIRIVEGKNxjPAKDITGS 
SDPYCI VKVDNE P I IRTATVWKTLCPFlffGEEYQVHLP PTFHAVA 
FYVMDEDAIiSRDDVIGiCVCLTRDTIASHPKGKFSLPSHTGLPSP 
WPPSHSETS PLGS VWSPAQGKP FLLS P E AGATFCTPGLCSAACS 
QAWLLLPLP 


5673 


327 


696 


I TVADQ I SH WSAGR I KNRTRI PE CIHS SAATTLAGPHTMEGE S V 
KLSSQTL I QAGDDEXNQRT IT VNPAHMGKAFKVMNELRS KQLLC 
DVMI VAEDVEI EAHRWLAfir <? P Y Fnn M VTCirsM Q 


5674 


17 


984 


GGGSM EGE S TSAVL j> G FVLGAIiAFQrtLNTDSDTEG FLLGE VKGE " 
AKNSITDSQ^DVEVVYTIDIQKYIPCYQLFSFYNSSGEVNEQA 
LKKILSNVKKNWGVTYKFRRHSDQIMTFRERIaLHI^ 
DLVFLLLTPSIITESCSTHRLEHSLYKPQKGLFHRVPLVVANLG 
MSEQLG YKTVSGSCMSTG FS RAVQTHS S KFFEEDGS LKE VHKIN 
EMYASLQEELKS ICKKVEDSEQAVDKLVKDVNRLKRE I EKRRGA 
QIQAAREKNIQKDPQENI FLCQALRTF FPNSEFLHS CVMSLKID 
MFLKVAVTTTTISM 


5675


80 


753 


EGSRRGPTRLARLSARAGRLRFPPGFSSRLIHFRGVSECRRPPG 
KSG VPVS APGSDGKWWEERPGMF3LMAS CCGWF KRWR E P VR KVT 
LLMVGLDNAGKTATAKGIQGE YPEDVAPTVGFS KINLRQGKFE V 
TIFDIX3GGIRIRGIWKNYYAESYGVIFWDSSDEERMEETKEAM 
SEMLRHPRISGKPILVLANKQDKEGALGEADVIECLSLEKLVNE 
HFCCL 


5676


2 


930 


FVSSPPPRPVQPARPGGFGLSGRRSLLCQVASTPAHVGVMRSPV 
RDLARNDGEESTDRTPLLPGAPRAEAAPVCCSARYNLAIIAFFG 
FFIVYALRVNLSVALVDMVDSNTTLEDNRTSKACPEHSAPIKVH 
HNQTGKKYQWDAETQGWILGSFFYGYI ITQI PGGYVAS KIGGKM 
LLG FG I LGTAVLTLFTP I AADLG VG PL I VLRALEGLGE G VTF PA 
MHAMWSSWAPPLERS KLLSISYAGAQLGTVISLPLSGI ICYYMN 

WTYVFYFFGTIGIFWFLLWIWLVSDTPQKHKRISHYEKEYILSS 
L 


5677


1 


1028 


PPRDGFLELRRLSVPLCSGPCPLTSLSRQGERSGGHLVAAARAA 
VTAETHPL PLLAPLAVCQS VKS PAACQVRPRPRAVALPAALGGP 
GRS L PGLTAATMSS FS ES ALEKKLSE DSNS QQ S VQTLS L WL IHH 
RKHAGPIVSVWHRELRKAKSNRKLTFLYLAWDVTQNSKRKGPEF 
TREFESVLVDAFSHVAREADEGCKKPIjERLLNIWQERSVYGGEF 
IQQLKLSMEDSKSPPPKATEEKXSLXRTFQQIQEEEDDDYPGSY. 
S PQDPSAGPLLTEEL I KALQDLENAASGDATVRQKI AS LPQEVQ 
DVS LL E K I TDK3AAERLS KTVDE ACLRNRG PGTS 


5678 


3 


593 


SSSPPSSTPS L PLP F YLLLGQLRLQLLWGTAHLSG AG EAAPCPG 
GSGRTAAPRTRADPAAQSLMIMNKMKNFKRRFSLSVPRTETIEE 
SLAE FTEQFNQLHNRRNENLQLGPLGRDP PQE CS T FS PTDSGEE 
PGQIiSPGVQFQRRQNQRRFSMBVRASGALPRQVAGCTHKGVHRR 
AAAliQ P DFD VS KRLS LPMDI 


5679


2 


423


LNSRVDDFVAVPGAIMDEDYYGSAAEWGDEADGGQQEDDSGEGfi "" 
DDAEVQQECLHKFSTRDYIMEPSIFNTLKRYFQAGGSPENVIQL 
LSBlfy TAVAQTVNLLAE WL IQTGVEP VQ VQETVENHLKSL L X KH 
FDPRKADS I FTEEGETPAWLEQMI AHTTWRDL FY KLAEAH PDCL 
MIjNFTVKVGRVLEIjRRKVTMNVYFWLLVCFL 


5680 


258 


592 


RRLTSTSEKLQNRKfSHTPLESLIHPQPSYKGFGIMFtiklCKKKIE 
ISGPSNFEHRVHTGFDPQEQKFTGLPQQWHSLLADTANRPKPMV 
DPSCITPIQLAPMKTIVRGNKPC 


5681


45 


869 


LLCAKTLGVRTKESQAE G YNRSG1 NNHQAED PR FC PS FCWM RSA ' 
RQTRPQRLRKEAARPPTPGSCPGGTGMDGKKCSVWMFLPLVFTL 
FTSAGLWIVYFIAVEDDKILPLNSAERKPGVKHAPYISIAGDDP 
PAS CVFSQVMNMAAFLAL WAVLR V I Q LKPKVLNPWLN 1 SGLVA 
LCLAS FGMTLLGNFQLT1IDEE IHNVGTS LT FGFGTLTC W I QAAL 
TLKVNI KNEGRRVGIPRV I LSAS I TLC VG PLLHPHGPKHPHVCS 
QGPVGPGHVL 


5682 


39 


622 


PSRS CLGTMRKWRHREVNLPEVTQQDAVCPAPI PS PGLSAQTGL 
QK I WG T I HCQ VCPGAPAW PGS PWH EEMGLLLLVPLLLL PGS YGL 
PF YNG FY YSNS ANDQNLGNGHGKDLLNG VKL WETPEETL FT YQ 
GASVILPCRYRYEPALVSPRRVRVKWWKLSENGAPEKDVLVAIG 
LRHRS FGDYQGRVHLRQD 


5683 


89 


778 


GS CGATALI TRCLAWS VL I S RLAMATYTCITCRVAFRDADMQRA 
Hx KTDvmRYNLRRKVASMAPVTAEGFQERVRAQRAVAEEES KGS 
ATYCT VCS KKFAS FNAYENHLKSRRH VELEKKAVQAVNR KVEMM 
NEKNLE KG LG VDS VD KDAMNAAIQQ AI KAQPS MS P KKAP PAPAK 
EARjNWAVGTGGRGTHDRDPSEKPPRLQWFECQAKKLAKHSEDD 


5684 


195 


677 


TWCFRGYLGPkViMXALDEPPYLTVSTbVsAKYRGAFCEAKIKT" 
AKRLVKVKVTFRHDSSTVE VQDDHI KGPLKVGAI VEVKNLDGAY 
Q3AVI NKLTDAS WYTWFDDGDEKTLRRSSLCLKGERHFAESET 
LDQLPLTNPEHFGTPVIGKKTNRGRRYE 


5685


779 


1262 


LLLQQPWHCFLLFPPFRFSHHMIPGPPGPHTTGIPriPAiVTPQ 
VKQEHPHTDSDLMHVKPQHEQRKEQEPKRPHIKKPLNAFMLYMK 
EMRANVVAECTLKESAAINQILGRRWHALSREEQAKYYELARKE 
RQLHMQLYPGWSARDNYVSPSS I PVALHS 


5686


128 


1181 


CTWWQVNI Tl^INDNHPTWKDAPYYINLVEMTPPDSDVTTVVA " 
VDPDLGENGTLVYS I QPPNKFYSLNSTTGKIRTTHAMLDRENPD 
PHEAELMRKIWSVTDCGRPPLKATSSATVFVNLLDLNDNDPTF 
QNLPFVAEVLEGIPAGVSIYQVVAIDtDEGLNGLVSYRMPVGMP 
RMDFLINSSSGWVTTTELDRBRIAEYQLRWASDAGTPTKSST 
STLTIHVLDVNDETPTFPPAVYNVSVSEDVPR\GSGWSG*AARN 
ND VGLNAELS Y FITGGNVDGKFS VG YRDAWRT WGLDRE TTAA 
YMLI LEA I DNG^VGKRHTGTATVTJVTVT .m/ism vd qt tt neovir 


5687 


17 


917 


AAPPAPPDG/ PPP/PPPAPPT/PGPAAyAPASSCQPRLSAGRAA " 
QGDGGAAAVGHVLWPAVGP VR VNPGLQTPV PR P ELLPG P \ S SS 

LHSDSSYPPDAGLSDDEEPPDASLPPDPPPLTVP/ADA/PMPVT 
SGCRM PSTS AS E / AAGGOG APT PA if a q ptd d d a c rxvpc c> o > r»o r> 

LPPHLTGGPGMYSSEAKLPNSFSCLGLAGTGAGI*GTASAHGTG 
PPVLPHVCTPSbANPQP\AVGPEASSLPLGVSGIGMSA/SAPIS 
SS P FVA IGS CWLRG I P P PGSG FLC PGRAPG PVP I TTHGQEGQGP 
VLDI 


5688 


1 


420 


LTKWDLFGNCYRLLKTdlEHGAMPEQVGVYWYS/CLYDSRKLF~ 
*SHMI IRSLL* KVIDDSLGQLPLLRELLL* *LNVTDRCI ILAYV 
LR VE KTFAI T YL KNFTVKVDFS LLGE I PLISMAAI LKL W I M KI D 
DGYIPAVF 


5689 


1504 


3 


HELSG KHI SMVSGNTCNWHPGGHS PGGGGQGE ITS KDRGE I PAL ~ 
IWA/RK?IGTWTATKPTHRAG*GGAEEYQPPPQPCEGPRSTSRG 
GEG * GHAVG PGRE IGKEGS LPFLGPKAIiGF*SASCQRAFEGGAH 
GSTAR KPAPAT PGTRHPRTME TR EVAQG WPAGPRSQFWDQHPHS 
PGEHRPSG\SPLPACPPRAWPKAGAVASATGTG\PQLPGSRGKQ 
KLPRTREPPLLQAGWAVRKP PWSEAKEGLGQAGRPSGMDS SAS \ 
PQTPGG RGSLEWG LPL YLG PHHDVK* RS DRLG * P P * GGQGGGGH 
GAPSTPGPGGEAW*LPQQTSRPKPGPQAY*GE\GSPGLQCPCSK 
EL * RVP PGSLG P S TQCK YE P TDKHS \GG ADAQLE VS TAGSRST F 
GQELKGPLDAGRLWPGAPSASSSHR*GG*ERARAGAGHRGST*A 
SSKIEQGRPRPGPTSDALADVEGGAES/GPHPWPLPGTLPNR/P 
GSPPPA*ASAGRKGTVSTLGGGLL 


5690 


1424 


58 


PSPPAGVCAAPAPLPLLALARRDRRPCS PGAEAAPWQTGGPAID " 
GAWRTSVSALRRGATG/ APCSPGAEAAPWQTGGPAI DG\DGELP 
* VRSEE APRGCGAEGGGPGSGPVRR PGAGRGAHAGQGRQQDPEP 

GHAAAL PERTRG VAE P PAWAHAGS DAWRAGR * SQRT * ERAR PRH 
PTFQGRAGS\GQPGYQPPNPHPGPSSPPAAP\GPRGA*GNPQLE 
KAPRSDRNPSQGLRTRIRRPETPDCGPPSPAGSSASASTFRCTS 

S LS LlLG P / P(3 AWN T .H T & T> Ann * xic n * r* T\VT3 r* * rv> t r n r\ r» « r» *>k j. 
~ " ""vj * / r'vartTLw uu a /tir'y UK. w nt»Jr w (jUisXuAirG VAGEDPR P P * 

gnfvr*lllmp/gva*rhgtsp?lgpslgenggqmdsgnlfgtp 
kg*shpaftkst*smeaeksywnhphr\drgrqgvrinclrvge 

S EMWGP YSAPRPGTVFLSS FLS PASEEH\PEGSSS FNTPFPPAG 
PEGDPGLNSPGLLP 


5691 


107 


5S0 


ISNDPSPGYNIEOMAKRGKKLVKtTPVTVlfr2Mm/CT7cr ,1 rr cct on — 

VAHRMLATGECTPEDLCFSLQVMQ*KTGTESWG*RFYIVEQN*S 
GDAPLI FS P YLSLTGNCG FAMLVE I TERAMAH\ CGS PGG PSLWG 
GVGVYVLLESVPLSYS 


5692


1193 


548 


T** • • *** * ■ »w w ▼ m ) i\m t jj oi\ \j r r jl ivvjo JltrXJ \yg V ir LiXUlV 

PS I FSSYPI /GLPQSGGEPGP VGEQQPVRRPEQ PSCGPASRMPL 
TSRSVPPGRGALPPDSLSTRKGLPRPSTAGHRVRESGHKVPVSQ 
RLNLPVMGATRSNLQPPRKVAVPGPTR* RDQDS KQDFSS KPLQS 
VPGIASTQQTLTPADSGPGTGGRDATRAGLPGVETMGNGVD 


5693 


1251 


1330 


ALTWPVRKGTTWWAQPHGCSNLVSRARLDLSSRPSQNTEPQAP * 
*QAGPPSSLRPP\SRRR*APEWPKRATGSRCRGI*SAPPWPWPAA 
RGE/PGSAPSHAP/PNSPRPSGTRHP/PGPSSRVLYSPSLPRNS 
PEAI VWRSS R FPLW F PLRCCFW VS GFKDPNP VLRFF 


5694 


3 


1338 


GS KB PARSLHKRGSGHKS SAG KWGS VTL STAGALG * K^LHQ * WT " 
QRCL\NNLS S EEFNASS S LNSLPST PTASRRNS T I VLRTDS EKR 
SLAESGLSWFSESEBKAPKKLBYDSGSLKMEPGTSKWRRERPES 
CDDS S KGGELKKPISLGHPGSLKKGKTPP VAVTS PITHTAQSAL 
KVAGKPEX3KATDKGKIAVKNTGLQR5SSDAGRDRLSDAKKPPSG 
IARPSrSGSFGYKKPPPATGTATVMQTGGSATLSKIQKSSGIPV 
KPVNGR KTS LDV5 NSAEPGFLAPG ARSN I Q YRSIj &R PAKSS S MS 
VTGGRGGPRPVSSSIDPSLLSTKQGGLTPSRLKEPTKVASGRTT 
P AP VNQTDREKEKAKA.KAVALDS dni s l ks I gs pestpknq as h 
PTATKLAE LP PTPLRATAKS FVKP PSLANLDKVNSNS LDLPS S S 
DTTQCI 


5695


3 


1338 


GSKEPARSLHRRGSGHKSSAGKWGSVTLSTAGALG*KQLHQ*WT 
QRCL\NNLSSEEFNASSSLNSLPSTPTASRRNSTIVLRTDS£KR 
SLAESGLSWFSESEEKAPKKLEYDSGSLKMEPGT^KWPpppdpq 
CDDSS KGG ELKKP IS LGH PG S LKKG KTP P VA VTS P I THTAQSAL 
KVAGKPEGKATDKGKLAVKNTGLQRSSSDAGRDRLSDAKKPPSG 
IARPSTSGSFGYKKPPPATGTATVMQTGGSATLSKIQKSSGIPV 
KPVNGRKTSLDVSKSAEPGFLAPGARSNIQYRSLPRPAKSSSMS 
VTGGRGGPRPVSSSIDPSLLSTKQGGLTPSRLKEPTKVASGRTT 
PAPVNQTDREKEKAKAKAVALDSDNISLKSIGSPESTPKNQASH 

PTATKLAE LPPTPLRAT AKS FVKP PS LANLDKVNSNSLDLPSSS 
DTTQCI 


5696 


3 


1338 


GS KE PARS LHRRGSGHKS SAG KWGS VTLSTAGALG* KQLHQ * WT 
QRCL\NNLSSEEFNASSSLNSLPSTPTASRRNSTIVLRTDSEKR 

SIiAESGLSWFSESEEKAPTC > in'.RVn<If3QT VMT? T> C VT.1T3 n n»n nor 

CDDSSKGGELKKPISLGHPGSLKKGKTPPVAVTSPITHTAQSAL 
KVAGKPEGKATDKGKLAVKNTGLQRSSSDAGRDRLSDAKKPPSG 
IARPSTSGSFGYKKPPPATGTATVMQTGGSATLSKIQKSSGIPV 
KPVNGR KTS LDVSNS AEPG FLAPGARSNIQ YRSL PRPAKS S S MS 
VTGGRGGPRPVSSSIDPSLLSTKQGGLTPSRLKEPTKVASGRTT 
PAP VNQTDRE KEKAKAKAVALDSDNI S LKS I GS PESTPKNQASH 

PTATIClJu^LPPTPLPATAKSFVKPPSIANLDKVNSNSLDLPSSS 
DTTQCI 


5697 


1147 


47 


* xvr\o J. iOXvu£\jlo etAI iSA/vlrfc'irtrEPVPAA 

QG PATVQSVEDFVPDDRLDRSFLEDTTPARDEKKVGAKAAQQDS 
DSDGEALGGNPMVAGFQDDVDLEDQPRGSPPLPAGPVPSQDITL 
SSEEEAEVAAPTKGPAPAPQQCSEPETXWSSIPASKPRRGTAPT 
RTAAPPWPGGVSVRTGPEKRSSTRPPAEMEPGKGEQASSSESDP 
EGPIAAQMLSFVMDDPDFESEGSDTQRRADDFPVRDDPSDVTDE 
DEGPAEPPPPPKLPLPAFRLKNDSDLFGLGLEEAGPKF<!<5FT?rv 

EGKTPSKENKKKXKKGKEEEEKAAKKKSKHKKSKDKEEGKEERR 
RRQQRPPRSRERTAA 


5698 


2 


666 


GAEAAE PQEDL PPL SQS SR FFQEQQ KMKTKSLGP VS FKD VAVDFT 
QEE KQQLO PEQ K I T YRD VMLENYSNLVS VG YH 1 1 KPD VI S KLEQ 
GEEPWIVEGEFLLQSYPDEVWQTDDLIERIQEEENKPSRQTVFI 
ETLI*R/ERGNVPGNTFDVETNPVPSRKIAYTHSLCNSCER\GF 
NASSBYISSDGRYARMKADECSGCX3KSLLHIKLEKTHPGDQAYE 


5699 


2 


1448 


RVRQPPGLWVRRTVPAMQCPAGLSRVPGVAG/DPSLPSFRGPRD 
EAAHRGTIQTARHTRKLYVQGPASGPPLPRVSTQVAI*DEKPLA 
RPS/GRTNAPFPQGQKPAG KAAPGPAAAGRVAMR\ PGHPGLLAS 
DSQRSSSKGSGWETPVPWS*AQPGWVSGLLLLGDPSGPGSL+RS 
TWLVGGARGPEGSGVRGSGWPSGCSDIGWALAGWNHS *HLDPNT 
WTQKWTGE/SPAPGEEG\VAPAPRGPTAEHGHCELTTESQYSNN 
V P I LFQNPSGALRS RRTEPAGWVP PTRHE+ DDG*TAAPAS GGAP 
VSTPTWAGTP/LNASLGPTDPQGKPGCRPPCALPKPAGPERSA* 
GGSU3CR/ SMLPASSGPPPAPGPRRLAAGAHTSASARCPPAAAA 
G WQPRR PG FAGRAALPGPPHP PSS * RELGGLPGPGW + TLDPLPA 

HPAHPPGSAPPWGALGGWAAARASLPWSPSLCLSFPAVTPVAGL 
FPPGRG 


5700 


923 


£97 


NGHKGVWEIKZY*RRSNIHKNSKS£SHLNQDHSFPPPTPNSAR"S 
KLHSTGTAKNTGLPLSGAPRQRAVFSGRTICQEFSSCLQCAYLD 
E*CSIASSLIKAILRVSVLSE 


5701 


59 


410 


IFEKICSDTQEFISPEINPQICSWLtFDKGAK/NriATGKDSLFN 
KWSWKKWLSTCR*MRPGPYFTPYTKINSK*IK/DANIRCETVKL 
LEENTGENLHDTGLGNVFLDMTPKTQPTKQK 


5702 


3


1517 


ETFVDPSQCGGIPSDSPHPVITPSRASESSASSDGPHPVITPSR

ASESSASSDGPHPVITPSRASESSASSDGLHPVTTPSRASESSA 

SSDGPHPVITPSRASESSASSDGPHPVITPSRASESSASSDGLH 

PVITPSRASESSASSDGPHPVITPSWSPGSDVTLLAEALVTVTN 

IEVINCSITEIETTTSSIPGASDTDLIPTEGVKASSTSDPPALP 

DSTEAKPHITEVTASAETLSTAGTTESAAPHATVGTPLPTNSAT 

EREVTAPGATTLSGALVTVSRNPLEETSALSVETPSYVKVSGAA 

PVS I EAGSAVGKTTS FAGSSASS YS PSEAALKNFTPSETLTMD I 

TTKGPFPTSRDPLPSVPPTTTNSSRGTNSTLAKITTSAKTTMKP 

PTATPTTARTR PTT\ A* VQVKNE VSSSOG* VWLPRKTSLTPEWQ 

KG*CSSSTGNSTPTRLTSRSPYCVSGEANG/PSAAARHVPYAKR 

GCCP*PGPPPTDCSCVTVLRGTQKVPMKGSMSKPLTPDVATGPS 

LTSTGVYVWGGASPVPRGVLGLTLAHVLCFSKEKT 


5703 


14 


1117 


HHKDS RSOGliPRTOECAR PELH P T . TiPPE A T.Wdvpp t c vd n a un'n 
PKAGIGTKAKPSESHLKLHPGWPSLDRQGEPATLGTGTGHCSDS 
RILRWHP*HTAAR*PRWRRIjPSSHRWTRHLGVLRVQDKS**VSL 
DPSCRPRFLRTC* * YGMRSVASSSNPPPGWSGPGASVFPARPVS 
ALPTGPRCW*APRGRTRQPCGWPRLSSPHATADWGPGCPLSPSR 
GSWETAPGS * WCPWL+ AAP WTGWPTAQRA QIht nDamo n e » utv 

RRVAGLL PGQGLTVRR * H * TAGAPAS VRS SQGATRS PAPGGDQC 
ACGRGPSSC+HPPPWPVSPSSPVPCPSGR*HLRGPLLSAARPRA 
AGWPRHSPHDTQTPEP 


5704 


23 


562 


GDYEFDSPYWDDISQAAKDLVTRLMEVEQDQRITAEEAISHEWI 
SGNAASDKNIKDGVCAOIEKNPARAKWKTfAV»VTTTJVj«T?T dadi? 

QS S TAAAQS ASATDTAT PG AAGGATAAAASGATSAPEGDAARAA 
KSDNVAPRRP*LPPQPQMEVPPQPLMAVSPQPPMEASLQPLMGE 
SPQP 


5705 


23 


562 


GDYEFDSPYWDDISQAAKDLVTRLMEVEQDQRITAEEAISHEWI 
SGNAASDKNIKDGVCAOIEKNFARAKWKKAVPVTTr mvtdt onoir 

QSS TAAAQSAS ATDTATPGAAGGATAAAASGATSAPEGDAARAA 
KSDNVAPRRP *LPPQPQME VPPQPLMAVS PQ PPMEASLQ P LMGE 
SPQP 


5706


1161 


610 


QLGRF XAQDT VA I RKVK E VFGTG AMRH W I L FTH KE D* GGQALD 
DYVANTDNCS LKDLVRE CERRYCAFNN WGS VB EQRQQQAELLAV 
IERLGREREGSFHSNDLFLDAQLLQRTGAGACQEDYRQYQAKVE 
WQVEKHKQELRENESNWAYKALLRVKHLMLLHYE IFVFLLLCS I 
LFFIIFLF 


5707 


28 


609 


GSPAPTPGFRRRPGRGTPSPGTRHHQGRAEPEPDAPERAPLRR* 
MFAIQPGLAEGGQFLGDPPPGLCQPELQPDSN'SNFMASAKDANE 
NWHGMPGRVEPILRRSSSESPSDNQAFQAPGSPEEGVRSPPEGA 
EI PGAEPEKMGGAGTVCS PLEDNG YAS SSLS I DS RS SSPEPACG 
TPRGPGPPDPLLPSVAQA 


5708 


44 


1925 



3FSWEETISPCFPKMPAEPWWLSPVSLGAAGWPGQPRPYLDLPA 
QASVSRPHDRA*GEAVSLSLSSGDVCGHTDGGGAGSDPQAKPKP 
PRCPFTAMPSPRTKQKVRNKVCLLIAIRYSDIPSDVSKAP\GPA 
GNPHDRSSTAA*LHRRAGAGSLCLSASI*LPPSFSLGAPGAPSPL 
RVS PASGGPR KEGRQGSGG * AGGGGP \ ARTHADL PCVG F VCS P P 
LLK* SDS PVKQLPA\SGQGSGAGMPPVGS SDILRPRPTSVSGTG 
RAAG * CS WOPAACCTPRS O * WAVAR <Z P c? P. PQP w * p n enu * T>n. * e 
S RRR RGP * AAGRS TPAVP * P CS * GGAGRRAYACRTGWG YAPS R * 
LEPSGPTSGSAL* TWASHSTGA* ♦SRLCGTAGTGPLCSQSSRS * 
AG*RCCCTAASPCGGSGPSHPGSPSAHCLSWSGGRTQPRAPSAH 
GRGRAMGSRCVCTCTGLPCPG I PLSGAS PGGSGETGAGRSHTLK 
AARS RLS PRPG SGS RGS Y* S HNDNWGT W P APPSAGHLLVGG * US 
QRTSSDH*YTGTRRPWAGPGTRCSTAPSRAAPPVSRCRPPPPPP 
PPRPPRLPAAAS/SGGASGSPAASCSCSCRAPAKPASS/GEAPA 
PPPRPEPPPPPARRP 


5709 


2 


2C31 


I TLC P L PQTEKCliN WTEAATP LG I YLKARVEAGGLKELiEI S WG 
LHQI VVRWGAVVMRAGMGGCRCWGVMAP FAPR/NALS FLVNDCS 
LI HNNVCMAAVFVDRAGEWKLGGLD YM YSAQGNGGG PPRKG I PE 
LEQYDP PELADSSGRWREKRSADMWRJjGCLI WEVFNGPLPRAA 
ALRNPG K I PKTL VPH YCEL VGANP KVRPN P AR FLQNCRA PGGFM 
SNRFVETNLFLEBIQI KEPAEKQKPFQELS KSLDAFPEDFCRHK 
VLPQLLTAFE FGNAGAWLTPL FKVGKFLS AEE YQQKI I PVWK 
MFS STDRAMR I R LLOOMEOFI O YLDE P TVNFTOT PPHVUursPT.TVr 

• m ww a *<rcw u\j.rvuuvyi '"VI V * MUSIC 4. V 11 ;yi f XT li V VflvJ 1! lHJj 

NPAIREQT VKS M LLLAP KLNBANLNVE LMKHFARLQ AKDEQG P I 
RCNTT VCLGK I GS YLSAS TRHR VLTS AFSRATRDP FAPS RVAG V 
LG FAATHNLYSMNDCAQ K I LPVL CGLT VD PE KS VRDQAFKAI RS . 
FLSKLESVSEDPTQLEEVEKDVHAASS PGMGGAAAS W AGWAVTG 
VSSLTS KM RSHPTTAPTETMIPORPTPEGVPAPAPTPVPATPT 

tsghwetqeedkdtaedsstadrwddedwgsleqeaesvlaqqd 
dmstggqvsrasqvs\tpttnppnpqsptgaagk\rgllgtgla 
gaklpgats * r ytagqrv 


5710 


1 


562 


VLQMLDTVRVLFSKGPFIAI FASDPHI I IKAINQNLNSVPSGFK 
\LNGHDYMRNI VHLPVFLNSRGL/RQ/LQENFS * LQQQMBTFHA 
QILQGYRKKLTEEFHRTALGR*QNLVARQPSIDG*DAIGFELYV 
C IA I QFNTNKDDAT 


5711 


1526 


1130 


RRHPFQWTTVTQEAFSHHDVAFTSTPVLFYPDSAQPFIVKSESS 
SQI AKAVLSQQR PS LFHE CAFHF PS * S LQRHT INLDQGI F * LLM 
LSEERQHLFESS/IWTTPHNLK*/FEIHEHLGSHEGHWTLFFLL 
QIL 


5712 


3 


1391 


GRKLFQSLDIS ERLECFLLTLDCVDDTL I VLAEEHGCLDI I KELP 
ETVI DI .T .NKCLTFHPSKRPTPDELMKDKVFSE VS PLYTPFTKPA 

fiLVQQRT.PnZVTM.TT.DirnT GfXT ntmTKTMTWT sobctpoi/wt t.i/it 
our ooaijK^ALrij i ijfejj XoyijOivJJ J, £4£iUi Li/UiKii iEKVYlLWCIj 

AGGDLE KELVNKE 1 1 RS KPP I CTLPNFLFEDGE S FGQGRDRS S / 

TFR*YHWDIWMPAKK*IERCWGRSIIiPITLKMTSLILPYSNSN 

EARVD I PP LMRGLT WAALLG VEGAI HAK YDAIDKDTP I PTDRQ I 
EVDIPRCHQYDELLSSPEGHAKFRRVLKAWWSHPDLVYWQGLD 
SLCAPFLYLNFNNEALVYACMSAFIPKYLYNFFLKDNSHVIQEY 
LTVFSOMIAFHDPELiSNHI.NETGFTPDIiYATPWPT,TMB*7 T W\/T7DT 
HKI FHLW \ DTLLLGEFLFPI I»YWE 


5713


<J34 


284 


PVCAVPVDRWPVLPREDQEGQQL*AKLPRDFRR* FQI LGPMEGH 
TACRCSRRGAQVQHLPREDIRAAE*DPHLREVWPaLPTSSATSP 
* RAVLTS PCSHLGS ADAASSHWLCGVS FH 


5714 


212 


613 


WGLGU5PTMSSLGGGSQDAGGSSSSSTNGSGGSGSSGPKAGAAD 
KSAWAAAAPASVADDTPPPERRNKSGI ISEPLNKSLRRSRPLS 
HYSSFGSSGGSGGGSMMGGESADKATAAAAAASLLANGHDLAAA 
MA 


5715


131 


1975 


esasqqkrskcliltlklelsgsapkktsarpgsslwlpphsqe 
qtppasklqggggglqtx3wglhp vpvtaas plprwciifgavak \ 
glpgp*i*cpsgaa/gglqrgpglsplgaagkvsclhppsmvenn 
dstchehhegilaarvtpvp\sgkpgrvlkppgrvcrpphpaas 
prppgs/sdldgprpqmhlrafpaahggpvntphggeektfmss 

QIRRKETKPL* RKTPAG\NNYQSNS I PVSQSPQLTVDLLPSAGR 
TQAPSGRGDAGKPTPGHGV r,PK AQVTI/rPNr , prQT.nr , r , n* dqpt 

YPKTPKQRRWRRPL/LLGPSQ *GSRQS TC+ EV\GALGEPVRI PG 
L*PDLSCILSNGSKHRREGLSFPRSLGPGRRGPAGLQSLGCSPT 
PKNTACHS SGHVALOAGHDS ABDVGSGHVAIaD Af5HD QTfinvrro d 

VWRWIPLE * LGLSRETGQATRRGLVWIS PGRAAAACVACAQALE 
EG PLRLPGQ DRGAQ PCSHCPGRAAGQ PB PGAG APCRE /GG * DP T 
GLT/GVPGTDPKRGGRKPGQSGQETQGPTWSGPESPLQPKP*E 
RQE/VGAGASSGVGLSRGRAGGPSSAWBVAAMLLLLRHGSHSEL 
TDLTEAQTSQH 


5716


1711


1370


R VFS LLCEG PGHC YQGAVCR EACAAASPG L DSAAEPHRLCEHTD 
* LPK*GPGY IQHFHCDSN ILCILYKISFNLFSYS F *GVARYAC * 
RCPLVL*SGFFTIIVGGYSCCMPLKT 


5717 


44 


1489 


lptealresewVseygkcgprglVpegestsplPs"svdtedsld 
egpgalvlesdlllgqdlefeeeeeeeegdgnsdqlmgferdse 






WO 01/53312 



PCT/US00/34263 



SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








GDS LGAR PGLPYGLSDDE SGGGRAIi5ARQFVP7PappDn^npr»p — 

RPGPACQLCGGPTGEGPCCGAGGPGGGPLLPPRLLYSCRLCTPV 
SHYSSHLKRHMQTHSGEKPFRCGRCPYASAOLVNliTRHTPTHTr 

EKPYRCPHCPFACSSLGNLRRHQRTHAGPPTPPCPTCGFRCCTP 
RPARPPSPTEQEGAVPRRPBDALLLPDLSLHVPPGGASFLPDCG 
Q\CGVKGRASAGLDQNHCQS/SLFPWTCRGCGQELEEGEGSRLG 
AAMCGRCMRGEAGGGASGGPQGPSDKGFACSLCPFATHYPNHLA 
RHMKTHSGEKPFRCARCPYASAHLDNLKRHQRVHTGBKPYKCPL 
CPYA03NIANLKRHGRIHSGDKPFRCSLCWYSCNQSMNLIRHM


5718


120 


284 


VAHALSLPAESYGNDVSMTHPQLPPTOLAWDLCRTCLPLSYNFT
S**STADPLHLi 


5719 


48 


428 


ELNNGPFQM^LCNGGNLAVTGSWADRSPLHKAASQGRLLALRTL 
LSQGYNVNAVTLDHVTPLHEACLGDHVACARTLLEAGANVNAIT 
I DG VT PLFNACSQGS P S CAEL»LLEYGAKAQ P\ESCLPSP 


5720 


1 




\RCT\YYE\TCGGTYGLQMWSVSFQDVAQKWAL\RKKQQ\LAI 
GPCK\SLPN\SPSH\SAVSAASIPARAPINQGHS/SGGGSAFSD 
w^** » *^ j Cj 1 -L/vioo 1 FTtr IRKQSKRRSNIFTS 
RKGADP\DREKKAAGCKVDSIGSGPAIPIKQGILLKRSGKSLNK 
EWKKKYVTLCDNGLLTYHPSLHDYMQNIHGKEIDLLRTTVKVPG 
KRLPRATPATAPGTSPRANGLSVERSNTQLGGGTGAPHSASSAS 
LHSE R P LS S SAWAGPR P EGLHQRS CS VSSADQW S E ATTS L P P GM 
QHPASG 


5721 


97 


492 


RHS S PCCSLRRTERSSNAAVST / TT VQQFKR Fl EN YRRHIGCVA ' 
VPYAI AGGLFLERAYYYAFAAHHTG I TDTTRVG I XLSRGTAAS I 
otiiro l iJJu;nLKi^iji xr JjKiil r IjNKxVPr DAAVDFHRLIASTA 


5722 


88 


1043 


VAL D VUAGS S PGGGMAGALLG PRVHG I RAVLP%VARGG VQAPG AP 
GSLGVSHAAAPPARPQGAAQSPHRGRRKGGGGAGLPPPRSPRFP 
QESVPASTSTARGPRRVSRRLPPQHPGPRGRRRRPGAGVGAPRR 
GRARGOAGLLGRQGQGGRGASRERAALQARRGRRPGPEPDQSCG 
GRPRRAAAAPGRAPADPQPPAPRPAPAPDVRPPADAPAPAPAPA 
PP PP PHLGALTAGSGEERQS QPRAETLRLGRGAPLP\ PRAERGG 
RPKQAEQQQ\PKRPTPPARGPQSSGDPAMLPQRAGLRTGGLAGT 
KSSTREIPEMI 


5723 


88 


1043 


VALDVLAGSS PGGGMAGALLd PRVHG 1 I RAVLRVARGG VQAPGAP 
GSLGVSHAAAPPARPQGAAQSPHRGRRHGGGGAGLPPPRSPRFP 
QESVPASTSTARGPRRVSRRLPPQHPGPRGRRRRPGAGVGAPRR 
GRARGQAGLLGRQGQGGRGAERERAALQARRGRRPGPEPDQSCG 
GRPRRAAAAPGRAPADPQPPAPRPAPAPDVRPPADAPAPAPAPA 
PPPP PHLGALTAGSGEERQSOPRAETLRLGRGAPLP\ PRAERGG 
RPKQAEQQQ\PKRPTPPARGPQSSGDPAMLPQRAGLRTGGIAGT 
KSSTREIPEMI 


5724 


3 


1841 


FTNEAP PAP LPDAS AS PLS PHRRAKS LDkRSTEP S VT PDLLNFK 
KGWLTKQYEDGK5WKKHWFALADQSLRYYRDSVAEEAADLDGE ID 
LSACYDVTE YP VQRNYGFQ I HTKEGE FT I*S AMTSG IRRN W I QT I 
MKHVHPTTAPDVTSSLPEEKNKSSCSFETCPRPTBKQEAELGEP 
D P EQKRS RARE \ RRR EGRS KTFDWAE FR P IQQALAQER VGG VGP 

ADTH\DPWRPEAEHGEIjEHHPAT?RRT7PPPVPC^MT l^Trvnnmn 

DAALRMEVDRSPGliPMSDIaKTHNVHVEIEQRWHQVETTPLREEK 
QVPI AP VHLSS EDGGDR LSTHEL TSLLEKELEQSQKEAS DLLEQ 
NRLLQDQI.RVALGREQSAREGYVLQATCERGFAAMBETHQKKIE 
DLQRQHQRELEKLREEKDRLLAEETAATI SAIEAMKNAHREEME 
RELEKS QRSQI S3 VNSD VEALRRQ YLE ELQ S VQRELE VLS EQ YS 
QKCLENAHLAQALEAERQALRQ CQRENQE LNAHNQELNNRLAAE 
ITRLRTLLTGDGGGEATGSPLAQGKDAYELEVPSGARPCLTQLC 
TQEPQGSAAWPLSYRWGGTDLRQQESQGPGRSKSPEGGEEQ 


5725 


3 


1049 



VNGHS EE TSQa PNRTEPHDS D CS VDLG I S JCS T3 DLS PQKSGPVG 
SWKSHSITNMEIGGLKIYD1LSDN\DLSSHLQPLK/FTSAVDG 
KNIVRSKAATLLYDQPLQVFTGSSSSSDLISGTKAIFKFDSNHN 
PE/GAKYNKRPHKWAHNLHLKYMVLHS1ISNTVAV\RSQRHFVA 
LOT KS PNRPCOFS SS APS / VDO HAD / TMO <3 V & ktj c hmmwi? cmtjw 
NVRANTAYHLHQRLG PARHGBKWAI S PNDRL I PAVTRS T I QRQS 
SVSSTASVNLGDPGSTRRAOIPEGDYLSYREFHQA(THTT>PMMDr 

SQRPLSARTYSIDGPNASRPQSARPSINBIPERTMSVSDFNYSR 
TSP 


5726 


2 


486 


SRSLSMWWNSGLPASSHSSKLPVTVGFSGCVKRLfeLHGRPLGAP
TRMAGVTPCILGPLEAGLFFPGSGGVITL/ESVGAGIPGPSRAG
QGSPGGSGEGPPLSSPSQPLPADLPGATLPDVGLELEVRPLAVT
GLI FHLGQARTPP YLQLQVTEKQVLLRADDG 


5727 


21 


221 


xs.tr a u a ucvoti ivK.ui'WrtiAji/iisvXiVAliltii *"HWEDQASCEVIiTVKKK 
AGAVTSTPNRNSSKRRSSLPNGE 


5728 


2 


877 


GTRNGQFEPRRGRAWEGSAGGLRAPGAAAGGPGVQPRGSG/LPG 
NAIRAGVNPGRGPASPFWDLSLPWDLWPPPTDHAPGAPDFPAVE 
GR\PWAGGRPPWPVSGVLGSRVCGPLYSTSPAGPG/SGGLSPSQ 
GGPAGAGGDAG/LPGRCPSAPWRAGSRPAASCPDWIPGPQGLWL 
HRNPTS/GPPSQ1GEGAEQGDEGVADAPQIQCKN/GAEDPPAEI) 
EPPQVPEAGEEDAVPAEEGPGGTPETQADQVRERPEAHLAEGGA 
KGSPRRLADPQDbPAGQMSLAPPFPPVAAVIRSNX 


5729 


1 


1525 


AGGAREVLTLULGHFAGFVGAHKWNQQDAALGkATDSkE"PPGEL " 

CPDVLYRTGRTLKGQETYTPRLIIiMDLKGSLSSLKEEGGLYRDK 

QLDAAIAWQGKLTTHKEELYPKNPYLQDFLSAEGVLSSDGVWRV 

KSIPNGKGSSPLPTATTPKPLIPTEASIRVWSDFLRVHLHPRSI 

CMIOKYNHDGEAGRLEAFGQGESVLKEPKYQBELEDRLHFYVEE 

CDYLQGFQILCDLHDGFSGVGAKAAEIiLQDEYSGRGIITWGLLP 

GPYHRGEAQRNIYRLLNTAFGL.VHLTAHSSLVCPLSLGGSLGLR 

PEPPVSFPYLHYDATLPFHCSAIIATALDTVTCS\YRLCSSPVS 

MVHlA ADMLS FCG K EGA/TAG A 1 1 P F PLAPGQS L PDS LMQFGGAT 

PWTPLSAOGEPSGTRCFAQSWLRGIDRACHTSQLTPGTPPPSA 

u^wkw a AuroAAiiM^f i iJUwuyi'i*vMbo^nAjljjjl\PCKVAPPYPHIjFS 

SCSPPGMVLDGSPKGAAVESVPVFG 


5730 


1258 


1713 


KKFQAPARETCVECQKTVYPMERLLANQQVFHISCFRCSYCNNK 
LSLGTYASLHGR1YCKPHFNQLFKSKGNYDEGFGHRPHKDLWAT 
KIETEGFWERPRNFENQGRPLKSPGGEDCPSC*GGCPGSNY*AQ 
GSS SREKGGQAS WN P KLRVA 


5731 


122 


443 


RSHRGELIPKDSCYMRKPPRRPkKRRQG/CALPQGCLTFKDVAI 
EFSLEEWKCLNPAQRALYRAVMLENYRNLESVGLTSKDSKYMRK 
KPGRGRGKQRRQEWFFLRVY 


5732 


226 


772 


rrijn^vudrnnivoKKivftnv i>vijjvi»vr 1 ororStiPIjiuCGCIjRF 
PERTCS QIiQQADWAPDFG PS S F VPS WGATATGARKFLI AFN I \N 
LLGTKEO^RlALNLREQGRGKDQPGRLXKVQGIGt^YLDEKNJLA 
QVSTNLLDFEVTAXHTVYEETCREAQELSLPVVGSQLVGLVPLK 
ALLDAA 


5733


1 




PALQEVWAMALAWGKQYENDARTLFBFTSGVNDTES P 1 1 YRDES 

MRTACSPDGLCSDGNGLELKCPFTSRDFMKFRLGGFEAIKSAYM 

AQVQYSMWVTRKNAWYFANYDPRMKREGLHYVVIERDEKYM\AS 
FDEl\VP\EFIGKMDEVT, c ?PnPM 


5734 


3 


968 


RCNSPESLTSLbVLLTTANNLFVLIPAYSlOJRAVAtFFIVFTVi: " 

GSLFLMNLLTAIIYSQFRGYLMKSLQTSLFRRRLGTRAAFEVLS 

SMVGEGGAFPQAVGVKPC3NLLQVLQKVQLDSSHKQAMMEKVRSY 

GSVLLSAEEFQKl,FNEIiDRSWKEHPPRPEYQSPFLQSAQFLFG 

HYYFT5YIjGNIiTAT.aTJTA7Q TH/PT vt nnrwrr DnnDnnormrT VT « 
«** ** w*i«RWfiJA/\i4/ii>iAjv3i.v-vr 1j v i-iiJAi>vi ^"Ap'kPPFXIjGIIjNG 

VFIVYYLLEMlxLKVFALGLRGYIiSYPSNVFDGIjLTVVLLVLEIS 
TL\VCTDCHTQAGGRRWW/RLLSLMDMTRMLNMLIVFRFLRIIP 
SMKPMAWASTVLGL 


5735


2


540 


FFTPCVARAFNFPDQATVKKAAYSLPRVGGGTSCGLPQARRISL 
ATPRQLYK/SSNMTQRWQRREISNFEYLMFLNTIAGRTYNDLNQ 
YP VFPW VLTNYES E ELDLTL PGN FRDLS K P I GALNPKRAVFYAE 

RYBTWEDDQSPPYHYNTHYSTATSTLSWLVRI VS I FIELACLWY 
LKILT 


5736 


1 


382 


GTRPS T KKSG YSPQQ VA VI H CKGHQKENTAVAHSNQKADSAAQ V "" 
TARLSVTPPNLLPTVSFPQPDLPDNPVYSTTTEKLASDLbANKN 
QES**ILPDSGIFIP*T*TSYLQSTTHLRRAKLPQLLRR 


5737


290 


1041 


KACLHLLS 5 F LTSNF LFNPLL PDS hYS VEARSQRANLG PCRRKR 
LQTLMRLAAG FQYSSHKDPSLSAKEKKTDYHNEARGPWPGWVG * 
RTADGS CGRG PDGAHH PGPKSSS WRAS RLLPGLGGS HHL DAYVG 
RDLECGT P A PLQLE I P PQ PRGHPAP I PTGQAG PRDS G PG AS P * V 
ETRPLTDGRR * PGVRPVGWTPAHPAGTLRPRGAVE PSVSACGKW 
APS PTSQGCCEGRCDAVPKHRAWRTPLCSQ 


5738 


B 


4*6 


DTLSLNCTLPETLPMTPSF*LSFL*PPGI*ARAKSIPTKTYSNEV 
VTL WYR P PD IhLQS TDYS TQ I DMW * GQVE VWQGPCG KGGGL VTT 

ATQPAAFLFTVPSLPRGVGCIPYEMATGRPLFPGSTVEEQIiHFI 
FRILSEEAWALCAVETHR 


5739 


1 


1222 


S FQRRG I RWNVHTLHPHPRAVWAGl GRGHGS * ALLGRARAPALcH 
FPTLLEFLESLEPDLPALBaMftT.WT.WhnrcDranjDiir'TeriT t >m r I 
SAEVDGP VPGYLSS PQS ITDTCLY I FTSGTTGIjPKAAR I SHLKI 

LQCQGFYQLCGVHQEDVIYLALPLYHMSGSLLGIVGCMGIGATV 
VLKSKFSAGOFWEDCOOHRVTVPOYTfJRT.rnJVT UMODDcvhDap 

KKVRLAVGSGLRPDTWERFVRRFGPLQVLETYGLTEGNVATINY 
TGQRG AVGRAS WLYKH I F P FSLI RYD VTTGEP I RDPQGHCMAT S 
PGEPGLLVAPVSQQSPFLGYAGGPELAQGKLLKDVFRPGDVFFN 
TRDLLVCDDQGFLRFHDRTGDPFRWKGENVATTEVAEVFEALDF 
LQEVNVYGVTV 


5740 


265 


231 


PAYVJItkVPTLCLESKTDLREKASHVSAQLQGEVRGLAGALWM*A 
YVYERVYN*NISRMVHALEQKRHPAGLSSSMALQLNPCLGMLMA 
LQSELHKLYDEETQSWVSGSACGGYP


5741 


1 


650 


PRKTMRRGVLMTLLQQSAMTLPLWIGKPGDRPPPLCGAIPASGtlH 
YVARPGDKVAARVKAVDGDEQW1LAEWSYSHATNKYEVDDIDE 
EGKERHTLSRRRVIPLPQWKANPETDPEALFQKEQLVIALYPQT 
TCFYRALIHAPPQRPQDDYSVLFEDTSYADGYSPPT,NUanpvw 
ACKE PKKK * CRliADS PS PNDTGQDSRGRAG I KH I PPLKKK | 


5742 


2 


362 


TQSVKE ILKRNPNVNLTDKDGNTALMIASKEGHTEI VQDLLDAG| 
TYVNIPDRSGDTVLIGAVRGGHVEIVRALLQKYADIDIRGQDNK 
TALYWAVEKGNATMVRDILQCNPDTEICTKDG 1 


5743 


2 


415 


GKTPE G I DA I EE I E I DLEETEREIS P QENGLE E VK PLG EMQTDL 

KATGREI S PREKTPEVIDATFPTDTmT.TTPTr'T} D t? T C nvtrXTr-- n-nn 

VKPVDEMETDLKTTGREGSSREKTREVIDAABVIETDLEETERE 
ISPQE


5744 


3 


703 


TRRTTTTSPTTTRQMTTrPAALPTTVVTTPDLTTGTPLQMTTIAH 
VFTTANrCLSLTPSTLPEEATGLLTPEPSKEGPILTAESETVLP 
SDSWSSAESTSADTVLLTSKESKVWDLPSTSHVSMWKTSDSVSS 
PQPGASDTAVPEQNKTTKTGQMDGI PMvS MKNEMPISQLLMI IAP 
SLGFVLFALFVAFLLRGKLMETYCSOKHTRLDYI GDS KMVT wnv 
QHGREDEDGLFTL 


5745 


1400 


599 


gksrfwlmkhskktydsfqdeledyikvqkarglepktcfrkfH 

KGDYLE TCGY KG E VNSRPT YRM FDQRLP S ET IQT YPRSCN I PQT 
VENRLPQWLPAHDSRLRLDSLSYCQFTRDCFSEKPVPLNFNQQE 

yiogshgvshrvykhfssdnststhqashkqihqkrkrhpeegr 

EKSEEERS KHKRKKS CEE IDLDKHKSI QRKKTEVE I ETVHVSTE 
KLKNRKEKKSRDVVSKKEERKRTKKKKEOGOFRTFFFMT urine t 
LGF


5746 


3 


B21 


S?ASGRIjTPSSPAFDGELDLQRYS^PAVSAWSIX3MGAVSWSES| 
RAGERRFPCPVCGKRFRFNSILALHLRTHQPERPRSPAARLLLE 

leerallrearlgrarssggmqatpateglarpqapsssafrcp 
yckgkfrtsaererhlhilhrpwkcglcsfgssqeeellhhslt 
ahgaperplaatsaapppqpqpqpppqpeprsvpqpepepqper 
eatptpapaapeeppappefrcqvcgqsftqswflkghmrkhka 

SFDHACPV 1 


5747 


2 


1328




DRHVETLCIHFLGPSTGSTAKTGGRNWLKTCNCLYGNTC^mGH 

psprgkgyssnyrrsperptgdlreriknkrodvdtepqkrnte 
bssspvrkessrgrhrekedikitkertpeseeenvewetnrdd 
SDNGDINYDYVHELSLEMKRQKIQRSLMKLEQENMEKREEIIIK ' 
KEVSPEWRSKLSPSPSLRKSSKSPKRKSSPKSSSASKKDRKTS 
AVSSPLLDQQRNSKTNQSKKKGPRTPSPPPPIPEDIALGKKYKE 
KYKVKDR I E E KTRDGKDRGRDFERQR EKRDKPRS TS PAGQHHS P 
ISSRHHSSSSQSGSSIQRHSPSPRRKRTPSPSYQRTLTPPLRRS 
AS P Y PS HSIiS S PQRKQ S P PRHRSPMR EKGRHDHERTSQSHDRRH 
ERREDTRGKRDREKDSREEREYEQDQSSSRDHRDDREPRDGRDR 
RE 


5748 


934 


473 


SEGPQVFVKGLAPTLX At FPYAGLOFSC YSS LKHLY KW A T papr 
KKNENLQNIXCGS GAG VI S KTLTY PLDLFK KR LQ VGG FEHARAA 
FGQVRRYKGLMDCAKQVLQKEGALGFFKGLSPSLLKAALSTGFM 
FFSYEFFCNVFHCMNRTASQR 


5749 


552 


1


GFPVDPRVRGSTLSLAERPKGMIRSGSFRDPTDDVHGSVLSLAS 
SASSTYSSAEERMQSEQ IRKLRRELES SQEKVATLTSQLSANAN 
LVAAFEQSLVNMTSRLRHIAETAEEKDTELLDLRETIDFLKKKN 
SEAOAVIOGALNASETTPIfET 1 nT?fPnM , CQnCTCGr mcttouoct 
GSSKDADA 


5750 


22 


866 


IFI S I CLWNAHLCFLLLP KDCIDQ VMKLQNLFVDDS GR YLA I Q F 
IILBWAYVFLYYYE YRKAKDQLDIAKDI SQLQI DLTGALGKRTRF 
QENYVAQLILDVRREGDVLSNCEFTPAPTPQBHLTKNLELNDDT 
ILNDI KLAD CEQ FQMPDLCAEE I A I ILGICTNFQKNNPVHTLTE 

TQALADQFEDKTTSVLERLKIFYCCQVPPHWAIQRQLASLLFEL 
GCTSSALQIEEKtEMWB 


5751 


3 


751 


SCGSALRAWRCGAAALATFPAPALPGLMYRALYAFRSAEPNAIiA 

FAAGETFIjVIjPR £ ? < ?AKWWT.aaT3n'PQrtc , Tr , v\7Do&vT ddt/^^ t 
innufci. uv iJEii^0An¥irVJJftHAntwuJ& Jk\jX VrirAX JjKKLiQGIjEQ 

DVLQAI DRAI EAVHNTAMRDGG KYS LEQRG VLQKL I HHRKETLS 
RRGPSASSVAVMTSSTSDHHLDAAAARQPNGVCRAGFERQHSLP 
SS EHLGADGGL FQ I PLPS SO I P PQPRRAAPTT P PP P VKRRDREA 
LMA5GSGGHNTMPSGGNS VS SGSS VSS CI 


5752 


3 


471 


GP VCGVGLS VAWAG PWRG# VHS VGGGGRAALHGAEL PCLSGAAT 
VEREMELRHKNEMLRVETEARARAKAERENADI IREQIRLKASE 
HRQTVLES IRTAGTLFGEGFRAFVTDRDKVTATVNT PTvnrrwnv 
AERQHVGAS WS PRSCPCRLCTAL 


5753 


34 


483 


DDSXAI PGGVQAP FGAVRNIYTPRTGHRIRKLDQIQSGGNYVAG 
GQEAFKKLNYLD IGE I KKR PME WNTEV KP VIHSR I NVSARFR K 
PLQEPCTI FLIANGDLINPASRLLIPRKTLNQWDHVLQMVTEKI 
TLRSGAVHRLYTLEGRLV 


5754 


14 


331 


TLVHVVEFAGEHAEAIASREQEVLQGWKELLSACEDARLHVSST 
ADALRFIIS 0 VRDLLS WMDGIASOIGAADKPRCPSSLIjfli .i>A qdw 
WPTPATP3PLTAPFSME 


5755 


3 


888 


LGDQFYKEAIEHCRSYNSRLCAERSVRLPFLDSQTGVAQNNCYI
WMEKRHRGPGLAPGQLYTYPARCWRKKRRLHPPEDPKLRLLEI K 
PEVELPLKKDGFTS ESTTLEALLRGEG VEKKVDAREEES IQE I Q 
RVLENDENVEEGNEEEDLEEDIPKRKE'JRTRGRARGSAGGEIRRHD 
AASQEDHDKPYVC0ICGKRYKNRPGLSYHYAHTHLASEEGDBAQ 
DQETRS P PNHRNENHR PQKG PDQTVIPNNYCDFCLGGS NMNKKS 
GRPEELVS CADCGRSAHLGGEGRKEKEAAA 


5756 


3 


621 


SSICtiQALFAHPIiYNVPEEPPLIjGAEn c ?TiTjA < inPAT.PVVD'DV\7aD " 
WKRjmKMYREQMNLTSLDPPLQLPXEASWVQFHLGINRHGLYSR 
SSPWSKLLQDMRHFPTISADYSQDEKALLGACDCTQIVKPSGV 
HLKLVLRFSDFGKAMFKPMRQQRDEETPVDFFYFIDFQRHNAEI 
AAFHLDRI LDFRRVPPTVGR I VNVTKEI L 


5757 


3 


473 


YKDALLLPDNHROWFENGTLKLTDVQKGMDEGEYLCSVLIQPQ"" 
LS ISQS VHVAVKVPPL I QPFEFPPASIGQIiLYI PCWSSGDMP I 
R ITWRKDGQVI I SGSGVTI ESKEFMSSLQ I S S VSLKHNGN YTC I 
ASNAAATVSRERQL I VR VP PR FW 


5758 


1 


474


FRRGAGAERGEHREGERGAAGMGEFKVHR VR FFNYVPSGI RCVA 
YNNQSNRLAVSRTDGTVEIYNLSANYFQEKFFPGHESRATEALC 
WAEGQRLFSAGLNGE IME YDLQALNI KYAMDAFGGP I WSMAAS P 


5759 


2 


1240 


SGSQLLVGCEDGS VK1>FQITPDKI PV 

gnaafagqgvvyetfhmsdlpsVWngtvhVvvnnqigfttdpr 
mars s p y ptdvarwnap i fhvnaddpbavi yvcs vaae wrntf 

NKDVGADLVCYRRRGHNEMDEPMFTQPLMYKQIHRQVPVLKKYA 
DKLIAEGTVTLQEFEEEIAKYDRICEEAYGRSKDKKILHIKHWL 

dspwpgffnvdgepksmtcpatgipedmlthigsvassvpledf 
kihtglsrilrgradmtknrtvdwalaeymafgsllkegihvrl 
ngqdvergtfshrhhvlhdqevdrrtcvpmnhlwpdqapytvcn 
s s ls e yg vlg fe lg yamas pn al vlw eaq fg dfhntaq c 1 1 dq f 

istgqakwvrhngivlllphgmegmgpehssarperplqmsndd 
sdaypaftkdf3vsql 


5760


1


1221


VRDITSDSLSLSWTVPEGQFDHFLVQFKNGOGQPKAVRVPGHED
GVTI SGLE PDHKYKMNL YG FHGGORVGP VS AVGT iTapr if n pttm a

PASTEPPTPEPPlKPRLEELTVTDATPDSIiSLSWTVPEGQFDHF

lvqykngdgqpkatrvpghedrvtisglepdnkykmnlygfhgg

GRVGPVSAIGVTAAEEETPTPTEPSMEAPEPPEEPLLGELTVTG
SSPDSLSI^WTVPQGRFDSFTVQYKDRDGRPQVVRVGGEESEVT
VGGLE PGRKYKMHLYGLHEGRRVG P VS T VG VTAPQEDVDET PS P
TEPGTEAPEPPEEPLLGELTVTGSSPD3LSLSWTVPQGRFDSFT
VQYKDRDGRPQAVRVGGQESKVTVRGIiEPGRKYXMHLYGLHEGR
RLGPVSAIGVT


5761


3


1275 


SCDMAEAAALVWIRGPGFGCKAVRC^GRCTVRDFIHRHCQDQN 
VPVENF FVKCNGALINTS DT VOHGA VYS LE PP r >rczr \cnr r^r c mt 

RALGAQIEKTTNREACRDLSGRRLRDVNHEKAMAEWVKQQAERE 
AEKEQKRLERLQRKLVEPKHCFTSPDYQQQCHEMAERLEDSVLK 
GMQAASS KM VSAE I SENRKRQWPTKSOTDRGAS AGKRRrFWT <km 

EGLETAEGSNSESSDDDSEEAPSTSGMGFHAPKIGSNGVEMAAK 
FPSGSQRARWNTDHGSPEQLQIPVTDSGRHILEDSCAELGESK 
EHKESRMVTETEE TQEKKAES KEP I BEE PTGAGLNKDKETEERT 
DGERVAEVAPEERENVAVAKLQESQPGNAVIDKETIDLLAFTSV 
AELELLGLEKLKCELMALGLKCGGTLQ 


5762 


2 


344 


GSTGQTPLHSQGGGGGSGGGRRRTPRGMP2CEKYEPPDPRRMYTI
MSSEEAANGKKSHWAELEISGKVRSLSASLWSLTHLTALHLSDN 
SLSRI PSDI AKLHNLVYLDLSSNKIR 


5763 


3 


429 


IJDKJDTGLIMLIARLDYELIQRFTLTIIARDGGGEETTGRVRINV " 

LDVNDNVPTFQKDAYVGALRENEPSVTQLVRLRATDEDSPPNNQ 

ITYSIVSASAFGSYFDISLYEGYGVISVSRPLDYEQISNGLIYIi 
TVMAMDAGN 


5764 



19 


441 


VCARACGEMRQLUIPIDRQRYDENEDLSDVEEIVSVRGFSLEEK 

LR^QLYCX3DFVHAMEGKDFNYEYVQREAI»RVpIiIFREKDGLGIK 

MPDPDPTVRDVKLLVGSRRLVDVMDVNTQKGTEMSMSQFVRYYE 
TPEAQRDKL


5765


3


825 


QAILRLNNSHQPPTSSSNSKDCGGPAS^GAGATAALADGLkFAS 
VQASAPQGNSHKETS KS KVKRS KTS KDANKSLPSAAXYGI PE I S 
S7GKRQEVQGRPGEATGMNSALGQSVSSGGSGNPNSNSTSTSTS 
AATAGAGSCGKSKEEKPGKSQSSRGAKRDKDAGKSRKDKHDLLQ 
GHQNGSGSQAPS GGHL YGFGAKS NGGGAS P FHCGGTG SGS VAAA 
GE VS KS APDSGLMGNS MLVKKE EEEEESHRR I KKLKTEKVDPL F 
TVPAPPPHV 


5766 


1608 


663 


SGLF S VDPASSQAMELSDVTL t EGVGNEVMVVAG VVVL t ijAliVL " 

AWLSTYVADSGSNQLLGAIVSAGDTSVLHLGHVDHLVAGQGNPE 

PTELPHPSEGNDEKAEEAGEGRGDSTGEAGAGGGVEPSLEHLLD 

IQGLPKRQAGAGSSSPEAPLRSEDSTCLPPSPGLITVRLKFLND 

TBELAVARPEDTVGALKSKYFPGQESQMKLIYQGRLLQDPARTL 

RSLNITDNCVIHCHRSPPGSAVPGPSASLAPSATEPPSLGVNVG 

SLMVPVFWLLG WWYFRINYRQFFTAPATVS LVGVTVFFS FLV 

FGMYGR 


5767 


2 


032 



NPRATPRPPl-RPkLRTGTEVILWYIJ^WRALMKRKRMKANIKLVG 
SG FPLPSSDLDDSLTEE IDEKIGFRNDANFDWQNVADFRDAGGS 
LtTEVKVEEEERDPQSPEFEIEEEEEMLSSVIPDSRRENELPDFP 
HiDhb b 1 LNStf SK^YDEPHLIjVNIEKQKLELEKRRIjDIEAER 
LQVEKERLQIEKERLRHLDMEHERLQLEKBRLQIEREKLRLQIV 
NSEKPSLENELGQGEKSMLQPQDIETEKLKLERERLQLEKDRLQ 
FLKF3SEKLQIEKERLQVEKDRLRIQKEGHLQ 


5768 


3 


476 


SSRSRLSVSVSPPPPGIVELGPPFAWEFCSRLGSAVTSQRAGPA
AAMVAKDYPFYLTVKRANCSLELPPASGPAKDAEEPSNKRVKPL 
SRVTSLANLIPPVKATPLKRFSQTLQRSISFRSESRPDILAPRP 
WS RNAAPS S TKRRDS KLWS ETFDVC 


5769 


38 


667 


TK7KKG VKE KATDQS VKAFAEHCPE LQ YVGFMGCS VTS KG VIHL 
TKLRNLSSLDLRHITELDNETAMEIVfG^CKNLISLiNLCIiNWIIN 
DRCVEVIAKEGQNLKEL YLVSCKITDYALI AIGR YSMTI ETVDV 
GWCKEITDC^ATLIAQSSKSLRYLGLMRCDKVNEVTVEQLVQQY 
PHI TFS TVLQDC KRTLERA YQMG WT PNMSAAS S 


5770 


1 


484 


DSRRYDVKTKKWSFLLEEHSKLIAKVRCLPQVQLDPLPTTLTLA 
FAS QLKKTSLSLTPDVPE ADLS EVD PKLVSNLM P FQRAGVNFAI 
AKGGRLLLADDMGLGKTIQAICIAAFYRKEWPLLVWPSSVRFT 
WEQAFLRWLPSLS PDCINVWTGKDRLTA 


5771 


168 


741 


GLLPSACLRARSWREASEGPSSRACSNGSQDTFEACYSGTSTPS 
FHGSHCSGSDHSSU3LEQLQDYMVTLRSKLGPLEIQQPAMLLRE 
YRLGL P I QD YCTGLLKLYG DRR KFLLLGMR P FI PDQD IG YFEG F 
LEG VG I REGG ILTDS FGR I KRSMSS TSASAVRS YDGAAQRPEAO 
AFHRLLADITKDIE 


5772


148 


383 


EFNIALVSPSHPQIKAEDDQPLPGVLLSIiSGGLFRSNLLTQDNG 
ILTFSNLVTCSAIYHLPVFPEREPGCSMRDLRVA 


5773 


2 


723 


PRVRSKHNFCFMEMNTRLQVEHPVTEMI1X5TDLVEWQLRIAAGE
KIPLSQEEITLQGHAFEARIYAEDPSNNFMPVAGPLVHLSTPRA 
DPSTR I ETGVRQGDEVSVHYDPMI AKLVVWAADRQAALTKLR Y$ 
LRQYNIVGLHTNIDFLLNLSGHPEFEAG^7VHTDFrPQHHKQLLL 
SRKAAAKESLCQAALGLILKEKAMTDTFTLQAHDQFSPFSSSSG 
RRLN IS YTRNMTLKDGKNS K 


5774 


2 


592 


FVE BEN I R WRCGGS ELN FRRAVFS AD S KY I FC VSGDF VKVYST 
VTEECVHILHGHRNLVTGIQLNPI^NHLQLYSCSliDGTIKLWDYI 
DGI L I KTFI VGCKLHALFTLAQAEDS VFVIVNKEKPDIFQLVSV 
KLPKSSSQEVEAKELSFVLDYINQSPKCIAFGNEGVYVAAVREF 
YLS VYFFKKETTSRVTLS SS 


5775 


3 


538 


S 5 GC CD PAAPS SLAEAATMPVS KC PKKS ES LW KG W DR KAQRNG U 
RSQVYAVNGDYYVGEWKDNVKHGKGTQVWKKKGAIYEGDWKFGK 
RDGYGTLSLPDQQTGKCRRVYSGWWKGDKKSGYGIQFFGPKEYY 
EGDW CG SQRSGWGRMYYSNGD I YEGQWENDKPNG EGM LRLSQN P 
RP 


5776 


2 


484 


Rt^DCV<^NLSESI^TLCPSKGLLFVPPDiDRR,TVELRIX3GNF 
IIHISRQDFANMTGLVDLTLSRNTISHIQPFSFLDLESLRSLHL 
DSNRLPSLGEDTLRGLVNLQHLIVNNNQLGGIADEAFEDFLLTL 
EDLDLS YNNLHG PAVGLRGDAW VQPS TS 


5777 


2 


949 


GQDPEPGQDLFQPEREVDPSWGRGREPRLGKLRFQNDHLSVLKQ 

VKKLEQALKDGSAGLDPQLPGTCYSPHCPPDKAEAGSTLPENLG 

GG5GSEVSQRVHPSDLEGREPTPELVEDRKGSCRRPWDRSLENV 

YRGSEGSPTKPFINPLPKPRRTFKHAGEGDKDGKPGIGFRKEKR 

NLPPLPSLPPPPLPSSPPPSSVNRRLWTGRQKSSADHRKSYEFE 

DLLQSSSESSRVDWYAQTKLGLTRTLSEENVYEDILDPPMKENP 

YEDIELHGRCU3KKCVLNFPASPTSSIPDTLTKQSLSKPAFFRQ 
NSERRNV 


5778 


1 


1210 


QRRQSVSRLLLPVFLLEPPAFPfiT.PPPDggpr'r'DPTrf^uAPPr.^ — 
GGPCWLQLEE VPGPGPLGGGGPLRS PSS YSS DELS PG E PLTS P P 
WAPLGA PERPEHLLNR VLERLAGGATRDS AAS D I LLDD I VLTHS 
LFLPTEKFLQELHQYFVRAGGMEGPEGIiGRKQACLAMLLHFLDT 
YQGLLQEEEGAGHI IKDLYLLIMKDESIiYQGLRBDTLRLHQLVE 
TVELKIPEENQPPSKQVKPLFRHFRRIDSCLQTRVAFRGSDEIF 
CRVYMPDHS YVTIRSRLSASVQD I LGSVTEKLQ YSEEPAGREDS 
LILVAVSSSGEKVLI*QPTEDCVF1*ALGINSHLFACTRDSYEALV 
PLPKEIQVSPGDTEIHRVEPEDVANHLTAPHWELPRCVH3LEFV 
DYVFHGE 


5779


138 


1S71 


EAVQVLI KHS ADVNARDKNWQTPLHVAAANKAVKCAE VI I PLLS 
SVNVSDRGGRTALHHAALNGHVEMVNLLLAKGANINAFDKKDRR 
ALHWAAYMGHLD WALLINHGAE VTCKDKKGYTPLHAAASNGQ I 
NVVKHLLNLG VE I DEI NVYGNTALHI ACYNGQDAVVNE LI DYGA 
NVNQPNNNG FTP LH FAAAS THGALCLELLVNNGADVN I QS KDGK 
S PLHMTAVHGRFTRSQTLI QNG GE I DCVDKDGNTPLHVAARYGH 
EIilOTLITSGAI)TAKCGIHSMFPIiHLAALNAHSDCCRKLLSSG 
QKY S I Vo LFSNEH VLSAGFE I DT PDK FGRTCLHAAAAGGNVEC I 
KLLQS SGADFHKKDKCGRTPLHYAAANCHFHCIETLVTTGANVN 
ETDDWGRTALHYAAASDMDRNKTILGNAHDNSEELERARELKEK 
EATLCLSFLLQNDANPS I RDKEG YNS IHYAAAYGHRQCLELLLE 
RTNSGFSESDSGATKSPLHLAVSEMP 


5780 


154 


624 


QPFR VI TCLP F KGP DYRL YKS E PELTT VAE VDE SNGEE KS EP VS 
EIETSWKGSHFPVGWPPRAKSPTPESSTIASYVTLRKTKKMM 
DLRTERPRSAVEQLCLAESTRPRMTVE EQMERI RRHQQACLREK 
KKCjLNVI GASDQ S PLQS P SNLRDNP 


5781 


19 


941 


RGSLGGHPWRIfPMRAASQGCbPVSFVTGPHQERAYGGRGPGGAF 
PAPPVSGTCPPDLIYAPTPEKAEGGSQKNHQPPPGERAAHRDGE 
QAPCRAGPTRKVAVAPRPPSCP*GPE\PGEEPRRPLDRSPPLGQ 
VQPHFTSQDAKSAEDEAPSRHLGKHQPRSAQVGSRLDALQGPKT 
QHSIHTVTCKSPRQKEDRSPKPPQAPKHPEEHGRQS\QAPPPLP 
VAPSRTCGGC*TWDPALLVSP/PQGDSTPELPAP\QQPTGGPSR 
CRQALPPQG*RQQPRQRPR/PTGASRSHPAKAKGCQGPPKIRNY 
NIMD 


5782 


5176 


1237 


drsmmsmaadsytdsytdtyteayMVpplppeepptmpplpp'ee 
p pmt pplppeepp egpal pteq s altaentwpte vpslps ees v 

SQPEPPVSQSEISEPSAVPTDYSVSASDPSVLVSEAAVTVPEPP 
PEPESS I TLTPVESAWAB EHE WPERP VTCMVS ETPAMS AEPT 
VLASE P PVM SETAET FDSMRAS GHVAS E VSTS LLVPAVTTP VLA 
ESILEPPAMAAPESSAiVIAVLESSAVTVLESSTVTVLESSTVTVL 
EPSWTVPEPPWAEPDYVTIPVPWSALEPSVPVLEPAVSVLQ 
PSM I VSEPS VSVQES TVTVSE PAVTVS EQTQVI PTEVAI ESTPM 
I LESS I MS S HVM KG I N LSSGDQNLAP E IGMQE IALHS GEE PHAE 
EHLKGD FYE S EHGINIDLNINNHLI AKEMEHNTVCAAGTS PVGE 
IGEEKI LPTSETKQRTVLDTYPGVSEADAGETLSSTGPFALE PD 
ATG\TSKGI3FTTASTLSLVNKYDVDLSLTTQDTEHDMLISTSP 
SGGSEADIEGPLPAKDIHLDLPSNINLVSSD1NEPLPVKRD\DQ 

tlaali\sl:<essggekevppps*rehlpdsgfsaniedinead 

LVRPVSSPRTWNVLPSPRAGL\EGP\LLASDFGPVQNLYSSPW 
\SSMP\ERASGS\SSGEKGG\YEIFVKVKDTHEKSKKNKNRDKG 
EKEKKRDSSLRSRSKRSKSSEHKSRKLTSESRSRARKRSSKSKS 
HRS\QTRSRSRS/RDRRRRSSRSRSKSRGRRSVSKEKRKRSPKH 
RSKSRERKRKRSSSRDNRKTVRARSRTPSRRSRSHTPSRRRRSR 
S VGRRRS FS I S PSRRS RTPSRR SRTPSRRSRTPSRRS RT PSRR S 
RTPSRRSRTPSRRRRSRSWRRRS FSIS PVRLRRSRTPLRRRFS 
RS P I RRKRS RS S ERGRS PKRLTDLDKAQLLE I AKANAAAMCAKA 
GVPLPPNLKPAPPPTIEEKVAKKSGGATIEELTEKCKQIAQSKE 
DDDVIVNKPHVSDEBEEEPPFYHHPFKLSEPKPIFFNLNIAAAK 
PTPP KS QVTLTKE FPVSS GSQHR KKEADS VYGEWVP VEKNGEEN 
KDDDNVFSSNLPSEPVDISTAMSERALAQKRLSENAFDLEAMSM 
LNRAQE RI OAWAQLKS I PGQFTGS TG VQ VLTQEQLANTG AQAW 1 
KKDQ FLRAAP VTGGMGAVLMRKMGWREG EGLGXNKEGNKE P I L V 
DFKTDRKGLVAVGERAQKRSGNFSAAMKDLSGKHPVSALMEICN 
KRRWQPPEFLLVHDSGPDHRKHFLFRVLINGSAYQPNCMFFLNR 
Y 


5783 


1693 


698 


DSGLRVAFTMEG ISNFKTPS KLSEKlOkS VLCSTPTIN I PAS PFM 
QKLGFGTGVNVYLMKRS PRGLSHS PWAVKKINP ICNDHYRS VYQ 
KRLMDEAKILKSLHHPNIVGYRAFTEANDGSLCLAMEYGGEKSL 
NDLIEE/PI*SQ/PKILF<JQP/LILKVALNMARGLKYLHQEKKL 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








LHGDI KSSN WI KGDFBTI KI CDVGVSLPLDE3JMTVTDPEACYI 
GTEPWKPKEAVEENGVITDKADI FAFGLTLWEMMTLS I PHINLS 
NDDDDEDKTFDESDFDDEAYYAALGTRPPINMEELDESYQKVIE 
LFSVCTNEDPKDRPSAAHIVBALSTDV 


5784 


2669 


1388 


PR VRPR VRTDHN YYI SRI YGPS DS ASRDL WVNI DQM E KDKVK I H

Cj T T i55 NTH R O AARVNT »S TIT) 17 D V VrstJ I7T .13 T? T T*\ 1 R TW t? t vnv^ r»t n rt t 

RM LTATQ Y I APLMAN FDPS VSRNSTVRYFDNGTAL WQWDHVHL 
QDNYNLGS FTFQATLLMDGRI I FGYKE I PVL VTQ I SS TNHPVKV 
GLSDAFWVHRIQQIPNVRRR7IYEYHRVELQMSKITNISAVEM 
TPL PTCLQFNRCGPCV S SQ IG FNCS WCS KLQRCSSGFDRHRQDW 
VDSGCFEESKEKMCENTEPVET\FLEPPQP*2RQPPSSGS*LPP 
E/DAVTSQFPTSLPTEDDTKIALHLKDNGASTDDSAAEKKGGTL 
HAGLI VGI LI LVLI VATAIL VT VYM YHH PTSAAS I FFI ERRPS R 
WPAMKFRRGSGHPAYAEVEPVGEKEGFIVSEQC 


5785 


2669 


1388 


PRVRPRVRTDkNYYISRtYGPSDSASRDLWVNIDQMEKDKVKIH 
au&jn AnKy>wu<vwjji>rDt l?rYGHFLREITVATGGFIYTGEVVH 
RMLTATQ Y I APLMANFD P SVS RNS TVRY FDNGTAL WQWDHVHL 
QDNYNLGS FTFQATLLMDGRI I FG YKEIPVLVTQISSTNHPVKV 
GLSDAFWVHRIQQIPNVRRRTIYEYHRVELQMSKITNISAVEM 
TPLPT CLQ FNRCGP CVS SQIG FNCS W C S KLQRCS SG FDRHRQDW 
VDSGCPEESKEKMCENTEP VET\ FLEPPQP* BRQPPSSGS *LPP 
E / DAVTSQFPTSLPTEDDTKIALHLKDNGASTDDSAAEKKGGTL 
rjv^ux. vui IjX bvbl VAi AILVTVYMYHHPTSAASIFFIERRPSR 
W PAM K FR RGS GHPAYAEVE P VGE KEG F I VS EQC 




25*2 


1674 


SYKLPAAERRASSCSQPPTPTRRRWPAPGRTSRGHRPQM*SGTP
APRPPARSTVSPASPLPKPRAGRCGSRPRSACSTFRFC*SLtN*M 
S *H*KRNLSQRSSSMSRRPLSCARPHR* *RQGLTVAARLP1'WAK 
SPPLACS FCQAAQKSQS LSSGRSTR* PERMS FRP \SPPGNPAIP 
SLAPSSRP/ P KGRPQCTWI PSRWPAS PTAPPTTT* APTSS PGST 
GRSMMTCPTRWTATPWS ARASSRPRNWPTP * WRPSGRLSTV* RA 
TGGSTATAPPKRFPRNWNPMMAE 


5787 


2 


1460 


KAS AAS VTS LADEVNCP \ ICQGTLKEAGSLSNCG/HKNFCRACL
T\RYCEIP\GPD \ LEES P \ TCP\ LCKE PFRP \GS FRPN WQLANV 

venierlqlvstlglgeedvcqehgekiyffceddemqlcwcr 

c*ft\j£ift>\l rt lPiKi* 1aEU/4A\AFYREQIHKCLKCLIKEREEIQEIQS 
RENKRMQVLLTQVSTKRQQVISEFAHLRKFLEEQQSILLAQL2S 

QDGDILRQRDE fdllvagei crfsalieeleeknerparelltd 

IRSTLIRCETRKCRKPVAVSPELGQRIRDFPQQALPLQREMKI4F 
LEKLCFELDYEPAHISLDPQTSHPKLLLSEDKQRAQFSYKWQNS 
PDNPQRFDRATC^IAHTGITGGRHTVTVVSIDLAHGGSCTVGVVS 
EDVQRKGELRLRPEEGVWAVRLAWGFVSALGSFP\TRLTLKEQP 
RQVRVS LDYE VGWVTFTNAVTREPI YTFTASFTRKVIPFFGLWG 
RGSSFSLSS 


5788 


2 


6860 


EHS VSGRS S A YGDATAEGHPAG PGS VS SS TGA I S TTTGHQEGDG
SEGEGEGETEGDVHTSNRLHMVRLMLLERLLQTLPQLRNVGGVR 
A I P YMQVI LMLTTDLDGEDE KD KGALDNLLSQL I AELGMDKKDV 
S KKNERS ALNE VH LWMRLLS V FMS RTKS GSKSSICESSSLISS 
ATAAALLSSGAVDYCLHVLKSLLEYWKSQQNDEEPVATSQLLKP 
HTTSSPPDMSPFFLRQ WKGHAADVFEAYTQLLTEM VLRLP YQ I 
KKITDTNSRIPPPVFDH5WFYFLSEYLMIQQTPFVRRQVRKLLL 
FICGSKEKYRQLRDLHTLDS \H VRGIKKLLEEQG I FLRASWTA 
S PQSALQ YDTL1 S LMEHL KACAE I AAQRT I NWQKFC I KDDS VL Y 
FLLQVS FLVDEGVS PVLLQLLS CALCGSKVLRALAASSGSSSAS 
SS PA P VAAS SGQATTQ5KSSTK KS KKB E KE KEKDG ETS GSQEDQ 
LCTALVNQLNKFADKETLIQFLRCFLLESN5SSVRWQAHCLTLH 
I YRNSS KSQQELL LDLMWS I W PELPAYGRKAAQFVDLLG YFSLK 
TPQTEKKLKE YSQKAVE I LRTQNHILTWHPNSN I YNTLSGLVEF 
DGYYLESDPCLVCNNPEVPPCYI KLSS I KVDTRYTTTQQWKLI 
GSHTI S KVTVKIGDLKRTKM VRT I NL Y YNNRTVQAI VELKNK P A 
RWHKAKKVQLTPGQTEVKIDLPLPIVASNLMIEFADFYENYQAS 
TETLQCP RCS AS VPANPGVCGNCGENVYQ CHKCRS IN YDEKDP F 
lcnacgfckyarfdfmuyakpccavdpieneedrkkaVsnintl 

LDKADRVYHQI^GHRPQIjENLLCKVNEAAPEKPQDDSGTAGGIS 
STS ASVNRYI LQLAQE YCGDCKNS FDELSKI IQKVPASRKELLE 
YDLQQREAATKSSRTSVQPTFTASQYRALSVLGCGHTSSTKCYG 
CASAVTEHCITLLRALATNPALRHILVSQGLIRELFDYNLRRGA 
AAMREEVRQLMCLLTRDNPEATQQMNDLI IGXVSTALKGHWANP 
DIASSLQYEMLLLTDSISKEDSCWELRLRCALSLFLMAVNIKTP 
VWENITIiMCLRILQKLI KPPAPTSKKNKDVP VEALTTVKPY CN 
EXHAQAQLW LKRD P KAS YDAWKKCL P I RG XDGNGKAP5 KS ELRH 
LYLTEKYVWRWKQFLSRRGKRTSPLDLKLGHNNWLRQVLFTPAT 
QAAROAAC? I VEALAT I PSRKQQVLDLLTS YLDELS IAGECAAE 
YI^YQKLITSAHWKVYLAARGVLPYVGNLITKEIARLLAIiEEA 
TLSTDLQQG YALKS LTGLLS S F VE VE S I KRHFKS RLVGTVLNG Y 
LCLRKLWQRTKL I DETQDMLLEMLEDMTTGTESETKAFt4AVCI 
ETAKRYNLDDYRTP VFI FERLCS 1 1 YPEENEVTEFFVTLEKDPQ 
QEDFIiQGRMPGNPYSSNEPGIGPLMRDIKNKICQDCDLVALLED 
DSGMELLVNNKI ISLDLPVAEVYKKVWCTTNEGE PMRI VYRMRG 
LLGDATEEFIESLDSTTDEEEDEEEVYKMAGVMAQCGGLECMLN 
RLAGI RDFKQGRHLLTVLL KLFS YC VKVKVNRQQ LVKLEMNTLN 
VMLGTLNLAL VAEQE S KDS GGAAVAEQVLS 1 ME I \ I QAE PNVEP 
LSEDKGNLLLTGDKDQLVMLLDQINSTFVRSNPSVLQGLLRIIP 
YLSFGBVEKMQILVERFKPYCNFDKYDEDHSGDDKVFL\DCFCK 
IAAGI K\NNSNGHQL\KDL\ ILQKG ITQNALD\ YMKKHIP/SAA 
R I W DADI \ WKS FCLRPALP F ILRLLRG LA X QHPG TQ VL I GTDS I 
PNLHKLEQVS \SDEG 1GTLA\ENL\ LESLREHPDVNKKI DA\AR 
RETRAEKKRMAMAMRQKALGTLG \ MTTNEKGQWD/TRTALLEA 
uwctbiatr NviJX^^xtKfcGiKrQPTKVI^IYTFTKRVVLGGVW 
ENKPRETS RATS TVSH FN I VH YDC \ HLA\ AVS LARGREE WE S AA 
LQNANTKCNGLLPVWGPHVPESAFATCLARHNTYLQECTGQRBP 
TYQLNIHDIKLLFLRFAMEQSFSADTGGGGRESKIHLIPYIIHT 
GLYVLNTTRATSREEKNLQGFLEQPKEKWVESAFEVDGPYYFTV 
LALH I LP PEQWRATR VE I LRRLLVTSQARAVAPGGATRLTD KAV 
KDYSAYRSSLLFWALVDLIYNMFKKVPTSNTBGGWSCSLAEYIR 
HNDMP I YEAAD KALKTFQEEFMPVETFSEFLDVAGLLSE I TDPE 
SFLKDLLNSVP 


5789 


1 


246?


LPLHAVEKTGRPGQPALKMPGKLR£DAGLESDTAMKKGETLRKQ 
TEEKEKKEKPK5DKTEEIAEEEETVFPKAKQVKKKAEPSEVDMN 
SPKSKKAKK\KEEPSQNDISPKTKSLRKKKEPIEKKWSSKTKK 
VTKNEEPSEEE IDAPKPKKMKKEKEMNGETREKS PKLKNGFPHP 
EPDCNPSEAASEESNSEIEQEXPVEQKEG\AFSNFPISEETIKL 
L KGRG VTFLF P 1 QAKTFHHVYSGKDL I AQARTGTGKTFS FAI PL 
1 EKLHG\ELQDRKRGRAPQVLVLAPTRELANQVSKDFSDITKKL 
SVACFYGGTPYGGQFERMRNG I DILVGTPGRIKDHIQNGKLDLT 
KLNHVVLDEVDQMLDMGFADQVEEILSVAYKKDSBDNPQTLLFS 
ATCPHWVFNVAKK YM KSTYEQVDLIG KKTQKTAI T VEH LAI KCH 

WTORAAVIGDVIRVY<;nHnfiRTTTP , rPTlfK'PnnTrT cnMcarvrin 

AQS LHG DIP QKQRE I TLKGFRNGS FGVLVATNVAARGLD I PE VD 
LVIQSSPPKDVESYIHRSGKTGRAGRTGVCICFYQHKEEYQLVQ 
VEQKAGIKFKRIGVPSATEIIKASSKDAIRLLDSVPPTAISHFK 
QS AEKL I EEKGAVEALAAALAH I SGATS VDQRSLINSNVGFVTM 
I LQCS I EMPN I S YAW KELKEQLGEE I DS KVKGMVF LKGKLGVCF 
DVPTASVTEIQEKWHDSRRWQLSVATEQPELEGPREGYGGFRGQ 
REGSRGFRGQRDGNRRFRGQREGSRGPRGQRSGGGNKSNRSQNK 
GQKRS FS KAFGQ 


5790 


3786 


1585 


ARRQRDPLQALRRRNQELKQQVDSLLSESQLKEALEPNI«lOHtV 
QRCIQLKQAIDENKNALQKLSKADESAPVANYNQRKEEEHTLLD 
KLTQQLQGLAVTISRENITEVGAPTEEEBESESEDSEDSGGEEE 
DAEEEEEEKEENESHKWSTGEEY I AVGDFTAQQVGDLTFKKGE I 
LLVI EKKPDGWWI AKDAKGNEGLVPRTYLE PYS EEEEGQESS EE 
GSEEDVEAVDETADGAEVK\QRTDPHWSAVQKAISEAGIFCLVN 
HVSFCYLIVLMRNRMETVEDTNGSETGFRAWNVQSRGRIFLVSK 
pvlqqinxvdvlttmgaipagfrpstlsqlleegnqfranVflq™ 

PELMPSQLAKRDLMWDATEGTIRSRPSRISLILTLWSCKMIPIjP 
GMSIQVLSRHVRLCLFDGNKVLSN1HTVRATWQPKKPKTWTFSP 
QVTRILPCLLDGDCFIRSNSASPDLGILFEIiGISYIRNSlGERG 
ELS CGWVFLKLFDASGVPI PAXT YELFLNGGTPYE KG I EVDPS I 
SRRAHGS VF YQIMTMRRQPQLL VKLRS LNRRS RNVLS LL PETL I 
GNMCSIHIiLIFYRQILGDVLLKDRMSLQSTDLISHPMLATFPMli 
LEQPDVMDALRSSWAGQES\TLKRSEKR\PKSFLKVPRFLLVYH 
XGCVLPLL/HTPTRLPPFRWAEEETETARWKVITDFLKQNQENQ 
GALQALLS PDGVHEPFDLSEQTYDFLGEMRKNAV 


5791


3 


1636 


LRVAEFAGTSR/IGAGLIQPLHRAPARDHGLLRGGAAPALSVSH
GN/GKQL/AMSSQGSDDEQIKRENIRSLTMSGHVGFESLPDQLV 
NRSIQQGFCFNILCVGETGIGKSTLIDTLFNTNFEDYESSHFCP 
NVKLKAQT YELQESNVQL.KLT I VNTVGFGDQINKEES YQ? I VDY 
IDAOFEAYLORFTiKT KT5Qr.P*TvuncDTinrr«T.vn»TC!r>n»/>tiPT vnr 

DJjLTMKNLDSKVYI I P VIAKADTVS KTELQKFKIKLMS ELVSNG 
VQ I YQFPTDDDTI AKVNAAMNGQLPFAVVGSMDEVKVGNKMVKA 
RQYPWGWQVE>IENHCDFVKLREMLICTNMEDLREQTHTRHYEL 
YRRCKLEEMGFTDVGPENKPVSVQETYEAKRHEFHGBRQRKEEE 
MKQM FVQRVKE KEA I L REAERE LQAKFEHL KR LHQE ERMKLEEK 
RRLLEEE I 1 AFS KKKATS E I FH SQS FLATGSNLRKD KDRKNSQF 
FVKQKVPEHRRSSSQANFIKKKLEVCFDFAVICFITS1FGEQPQ 
LLI FMEKYFQVQGQYISQSE 


5792 


2263 


653 


AAAAPS PAW W CG VFWYWHTC W VMYG I V YTRP CSGD AS C I QPY
LARRPKLQL\RHS FTTTRSHLGAENNIDLVLNVEDFDVES KFER 
TVNVSVPKKTRNNGTLYAYIFLHHAGVLPWHDGKQVHLVSPLTT 
YMVPKPEEINLLTGESDTQQIEADKKPTSALDEPVSHWRPRLAI* 

RVKDLMVINRSTTELPLTVSYDKVSLGRLRFWIHMQDAVYSLQQ 
FG FSEKDAD EVKG I FVDTNL YFLALTFF VAAFHLLFDFLAFKND 
ISFV/KKKKSMIGMSTKAVLWRCFSTWIFLFLLDEQTSLLVLVP 
AGVGAAIELWKVKKALKMTIFWRGLMPEFQFGTYSESERKTEEY 
DTQAMKYLSYLLYPLCVGGAVYSLLNIKYKSWYSWLINSFVWGV 
YAFGFLFMLPQLFVNYKLKSVAHLPWKAFTYKAFNTFIDDVFAF 

IITMPTSHRLACFRDDWFLVYLYQRWLYPVDKRRVNEFGESYE 
EKATRAPHTD 


5793 


2263 


653 


AAAAPSPAWWCGVFWYVVHTCWVMYGIVYTRPCSGDASCIQPY 
LARRPKLQL\RHSFTTTRSHIiGAENNIDLVLNVEDFDVESKFER 
TVNVS VP KKTRNNGTLYAY I FLHHAGVL PWHDGKQVHLVS P LTT 
YMVPKPEEINLLTGESDTQQIEADKKPTSALDEPVSHWRPRLAL 
NVMADNFVFDGSSLPADVHRYMKMTOT/STTrviTVT dtt umrtT okt 

RVKDLMVINRS TTE LPLTVS YDKVS IjGRLRFW I HMQDAVYS LQQ 
FGFSEKDADEVKGIFVDTNLYFLALTFFVAAFHLLFDFLAFKND 
ISFWKKXKSMIGMSTKAVLWRCFSTWIFLFLLDEQTSLLVLVP 
AGVGAAI ELWKVKKALKMTI FWRGLM PEFQFGT Y5ESERKTEE Y 
DTOAMK YLS YLL YPLCVGGA VYSLLNI KYKS W YS WL IWSFVNG V 
YAFGFLFMLPQLFVNYKLKSVAHLPWKAFTYKAFNTFIDDVFAF 
I ITMPTSHRLACFRDDWFLVYLYQRWLYP VDKRRVNEFGESYE 
EKATRAPHTD 


"5794 


1 


5016 


MGPRLSVWlililiLPAALLLHEEHSRAAAKGGCAGSGCGKCDCHGV^ 
KGQKGERGLPGLQGyiGFPGMQGPEGPQGPPGQKGDTGEPGLPG 
TKGTRGPPGASGYPGNPGLPGI PGQDGPPGPP3I PGCNGTKGER 
GPLG P PGL PGFAGNPGPPGL PGMKGDPGE I LGHVPGMLL KGERG 
FPGIPGTPGPPGLPGLQ3PVGPPGFTGPPGPPGPPGPPGEKGQM 
GLS FQG PKGDKGDQG VS GP PGVPGQAQVQ EKGDFATKG EKGQKG 
EPGFQGMPGVGEKGEPGKPGPRGKPGKDGDKGEKGSPGFPGEPG 
YPGLIGRQGP\QGEKGEAGPPGPPGIVIGTGPLGEKGERGYPGT 
PGPRGEPGPKGFPGLPGQPGPPGLPVPGQAGAPGFPGERGEKGD 
RGFPGTS LPG PSGRDGLPGPPGS PGP PG Q PGYTNG I VECQ PG P P 
GDQG PPG IPGQPGFIGEIGE KGQ KGESCL I CD I DG YRG P PGPQG 
PPGEIGFPGQPGAKGDRGLPGRDGVAGVPGPQGTPGLIGQPGAK 
GEPGEFYFDLRLKGDKGDPGFPGQPGMPGRAGSPGRDGHPGLPG 
PKGSPGSVGLKGBRGPPGGVGFPGSRGDTGPPGPPGYGPAGPIG 
DKGQAGF PGG PGS PGL PGP KGE PGKI VPL PGP PGAEGLPGS PG F 
PG PQGDRG FPG TPGR \ PGL\ PGEKGAVG\QPG IGFPG P PG P KGV 
DGLPGDMGPPGTPGRPGFNGLPGNPGVQGQKGEPGVGLPGLKGL 
PGL PG I PGT PGE KGS I G VPG VPGEHG AIGP PGLQGI RGE PG P PG 
LPGSVGSPGVPGIGPPGARGPPGGQGPPGLSGPPGIKGEKGFPG 
FPGLDMPGPKGDKGAQGLPGITGQSGLPGLPGQQGAPGIPGFPG 
SKGEMGVMGTPGQPGSPGPWGAPGLPGEKGD\HGFPGSSGPRGD 
P3LKGDKGDVGLPGKPGSMDKVYMGSMKGQKGDQGEKGQIGPIG 
EKGSRGDPGTPGVPGKDGQAGQPGQPGPKGDPGISGTPGAPGLP 
GPKGSVGGMGLPGTPGEKGVPGIPGPQGSPGLPGDKGAKGEKGQ 
AGPPGIGIPGLRGEKGDQGIAGFPGSPGEKGEKGS IGIPGMPGS 
PGLKGSPGSVGYPGSPGLPGEKGDKGLPGLDGIPGVKGEAGLPG 
TPGPTGPAGQKGEPGSDGIPGSAGEKGEPGLPGRGFPGFPGAKG 
DKGSKGEVGFPGLAGSPGIPGSKGEQGFMGPPGPQGQPGLPGSP 
GHATEGPKGDRGPQGQPGLPGLPGPMGPPGLPGIDGVKGDKGNP 
GWPGAPG VPG P KGD PGFQGM PGIGGS PG ITGS KGDMG PPG VPG F 

QGPKGLPGL0GTIOTD(Tfinnf3T/'D'iaVrST.D/2Oti^Oli#^nvrN-r-rer^e.«. 

GLPGPEGPPGLKGLQGLPGPKGQQGVTGLVGIPGPPGIPGFDGA 
PGQKGEMGPAGPTGPRGFPGPPGPDGIjPGSMGPPGTPSVDHGFL 
VTRHSQT I DDPQCPSGTK I L YHGYS LL Y VQGNERAHGQDLG TAG 
SCLRKFSTMPFLFCNINNVCNFASRNDYSYWLSTPEPMPMSMAP 
ITGENIRPFISRCAVCEAPAMVMAVHSQTIQIPPCPSGWSSLWr 
GYSFVMHTSAGAEGSGOALAQPrqpt pppDcanT?TT?r<tj/*e/^nvTvr 

YYANAYSFWLATIERSEMFKKPTPSTLKAGELRTHVSRCQVCMR 
RT 


5795 


1192 


61 


STRSPTVEYlSAHPHIIiFMliLKGYEAPQIALRCGIMLRECIRHE 
PLAKIILFSNQFRDFFKYVELSTFDIASDAFATFKDLLTRHKVL 
VADFLEQN YDTI FED YEKLLQ S2NYVT KRQS LKLLG E L I LDRKN 
FAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPH 
KTQPIVEILLKNQPKLIEFLSSFQKERTDDEQFADEKNYLIKQI 
RDLKKTAP+RALRDSKR


5796
5797


2


1078 


GRVGWEljWGMYISPPKDMWnArtnpQT.DTOTDaMT/^r'ciMrtrRTTivT? *~ 
FGE I GLLD PGMDVYGGENI ELG I K VWLCGGSME VL P CS RVAH I E 
RKKKPYNSNIGFYTKRNALRVAEVWMDDYKSHVYIAWNLPLEKP 
GIDIGDVSERRALRKSLKCKNFQWYLDHVYPEMRRYNNTVAYGE 
LRNW KAKD VCLDQG PLENHTA I L YPCHG WG PQLARYTKEG FLKX 
GALGTTTLLPDTRCIjVDNS KSR L PQLLD CDKVXS S LYKRWNFIQ 
NGA I MNKGTGRCLE VENRGLAG I DL I LR SCTGQRWT I KNS I K*R 
EGAGALE PGPQDMAAP PNI WTS CPGGETARGROVT.nr ppp a ccr 
QHRDPG 




2 


891 


PRVRQKTLVDVTLENSNIKDQIRNIiQQTYEASMDKLREKQRQLE 
VAQVENQLLKMKVESSQEANAEVMREMTKKIiYSQYEEKLQEEQR 
KHSAEKEALLEETNS FLKAIEEANKKMOAAEISLEFKnOP TP PT 
DRL I ERME KERHQLQLQLLEHETEMSG ELTDS DKERYQQLE EAS 
AS LRER I RHLNDMVHCQQ KKVKQMVEE I ES LKKKLQQKQLL I LQ 
LLE K I S FLEG ENNELQSRLDYLTETQAKTE VETRE IGVGCDLLP 
SQTGRTREIVMPSRKYTPYTRVLELTMKKTLT 


5798 


644 


115 


KI LGS RWKSMSNQEKQP Y YEEQARLS KIHLEKYPNYttYKtRPKR 
TCIVDGKKLRIGEYKQLMRSRRQEMRQFFTVGQQPQIPITTGTG 

wypgaitmatttpspqmtsdcsstsaSpepslpviqstygmkt 
dggs lagnem ingedemem yddyeddpxsd ys seneapeav s an 


5799


2679 


1435 


LLSTYIKFINLFPETKATIQGVLRAGSQLRNADVELQQRAVEYL
TLSSVASTDVLATVLEEMPPFPERESSILAKLKRXKGPGAGSAL 
DDGRRDPSSNDINGGMBPTPSTVSTPSPSADLLGLRAAPPPAAP 
PASAGAGNLLVDVFDGPAAQPSLGPTPEEAFLSPGPEDIGPP I P 
EADELLNKFVCKNNGVLFENQLLQIGVKSEFRQNLGRMYLFYGN 
KTSVOFQNFSPTWHPGDLQTQLAVQTKRVAAQVDGGAQVQQVL 
NriECLRDFLTPPLLSVRFRYGGAPQALTLKLPVTINKFFQPTEM 
AAQDF FQR WKQLSLPQQEAQ K I FKANHPMDAE VTKAKLLGFGSA 
LLDNVDPNPENFVGAG I IQTKALQVGCLLRLEPNAQAQMYRLTL 
RTS KE PVSRHLCE LLAQQ P 


5800 


2679 


1435 


LLSTYIKFINLFPETKATIQGVLRAGSQLRNADVELQQRAVEYL 
TLS S VASTDVIATVLE3MPP FPERESS I LAKIjKRKKG PGAGS Au 
DDGRRDPSSNDINGGMEPTPSTVSTPSPSADLLGLRAAPPPAAP 
PASAGAGNLLVDVFDGPAAQPSLGPTPEEAFLSPGPEDIGPPIP 
EADELLNKFVCKNNGVLFENQLLQIGVKSEFRQNLGRMYLFYGN 
KTS VQFQNFS PTWH PGDLQTQLAVQTKR VAAQ VDGGAQVQQ VL 
NIECLRDFLTPPLLSVRFRYGGAPQALTLKLPVTINKFFQPTEM 
AAQDFFQRWKQLSLPQQEAQKI FKANHPMDAEVTKAKLLGFGSA 
LLDNVDPNPENFVGAG 1 1 QTKALQ VGCLLRLE PNAQAQMYRLTL 
RTSKEPVSRHLCELLAQQF 


5801 


3 


1413 


FPRLYHLIPDGEITSIKINRVDPSESLSIRLVGG^feTPLVHIII 
QHI YRDGVI ARDGRLLPGDI ILKVNGMDI SNVPHNYAVRLLRQP 
CQVLWLTVMREQKFRSRNNGQAPDAYRPRDDSFHVILNKSSPEE 
QLG I KLVRKVDEPGVFI FNVLDGG VAYRHGQLEENDRVLAINGH 
DLRYG SP ES AAHLI QAS E RR VHL WS RQVRQRS PDI FQEAGWNS 
NGSWS PGPGERSNTPKPLHPT I TCHEKWN IQKDPGB SLGMTVA 
GGASHREWDLPI YVISVEPGGVI SRDGRI KTGD I LLN VDGVELT 
B VSRSEAVALLKRTSSS I VLKALEVKEYE PQEDCSS PAALDSNH 
NMAPPSDWSPSWVMWLELPRCLYNCKDIVLRRNTAGSLGFCIVG 
GYEEYNGNKPFFIKSIVEGTPAYNDGRIRCGDrLLAVNGRSTSG 
MIHACLARLLKELKGRITLTIVSWPGTFL 


5802 


3 


290 


CFSLYQIMERIMDLPTLLRHAFREMFSVGGLFHMFRIRIILCLM 
GAFFYLISPLDFVPEALFGILGFLDDFFVIFLLLIYISIMYREV 

TTflDf TO 


5803 


2234 


1299 


EAQFGTTAEIYAY^EEQDFGlBIVKVKAIGRQ-RyRVLELRTQSD 
G IQQAKVQ I L P3 CVL P S TMS AVQLES LNKCQ IFPSKPVSREDQC 
SYKWWQKYQKRKFHCANLTSWPRWLYSLYDAETLMDRIKKQLRE 
WDENL KDDS L PSNP I DFS YRVAACLP I D D VL R I QLLKIG5 AI QR 
LRCELD I MNKCTS LCCKQCQET E I TT KNE I FSLS LCG PMAAYVN 
PHGYVHETLTVYKACNLNLIGRPSTEHSWFPGYAWTVAQCKXCA 
SHIGWKFTATKXDKSPQKFWGLTRSAIiLPTIPDTEDEISPDKVI 
LCL


5804 


2 


1707 


EMEKQRaEEQRKRTEEERKRRIEQDMIiEKRKiQREtAKRAEQIE 
D INNTG TES ASEEGDDS LL ITWP VKS YKTSGKM KKN FEDLE KE 
REEKERIKYEEDKRIRYEEQRPSLKEAKCLSLVMDDE1ESEAKK 
ESLSPGKLKLTFEELERQRQENRKKQAEEEARKRLEEEKRAFEE 
ARRQMVNEDE ENQDTAK I FKGYRPGKLKLS FEEMERQRREDE KR 
KAEEEARRRI EEEKKAFAEARRNMVVDDDSPEMYKT ISQEFLTP 
GKLEINFEELLKQKMEEEKRRTEEERKHKLEMEKflEFEQLRQEM 
GEEEEENETFGLSREYEELIKLKRSGSIQAKNLKSKFEKIGQLS 
EKE IQKKIEEERARRRAIDLEIKEREAENFHEEDDVDVR PARKS 
EAPFTHKVNMKARFEQMAXAREEEEQRRIEEQKLLRMQPEQREI 
ut^u^r^i\x^£!iCiti t £,tut,ijtt lPlNbo r/UiDBBQTRSGAPWFKKPLKNT 
SWDSEPVRFTVKVTGEPKPEITWWFEGEILQDGEDYQYIERGE 
TYCLYLPETFPEDGGEYMCKAVNNKGSAASTCILTIESKN 


5805


3 


776 


I lou l L»bU V i KbKIRW WI EENGGNGNIS VDDLIALLDLAEHASS 
AFKESQQQSBDREYEVKERLYPKSKRRYDTYNIAGYQGEIEVGL 
YTIQILQLIPFFDNKNELSKRYMVNFVSGSSDIPGDPNNEYKLA 
LKNYI PYLTKLKFSLKKSFDFFDE YFVLLKPRNNI KQNEEAKTR 
RKVAGYFKKYVDIFCLLEESQNNTGLGSKFSBPLQVERCRRNLV 
ALKADKFSGLLEYLIKSQEDAJSTMKCIVNEYTFLLK 


5806 


1257 


877


AVFTFHNHGRTANLYSLHSWLGITTVFLFACQRFLGFAVFLLPW 
ASMWL RS LLKP I HVFFGAAI LSLS I AS VI SG I NE KL FFS L KNTT 
RPYHS LPSEAVFANSTGMLWAFGLLVLYILLAS SWKRP 


5807 


22*7 


1302 


RFS KKTFRRPMAVDIQP ACLGLYCGKTLLFKNGSTE I YGECGVC 
PRGQRTNAQKYCQPCTESPELYDWLYLGFMAMLPLVLHWFFIEW 
YSGKKSSSALFQHITALFECSMAAIITLLVSDPVGVLYIRSCRV 
LMLS DW YTMLYNPS PDYVTTVHCTHEAVYPLYTI VF I YYAFCLV 
LMMLLRPLl>VKKIACGLGKSDRPKSIYAALYPt?PILTVL3AVGG" 
GLLYY A FP Y I ILVLS LVTLAVYM S ASB I ENC YDLL VR KKRL I VL 

FSHWLLHAYGI IS ISRVDKLEQDLPLLALVPTPALPYLFTAKPT 
EPSRILSEGANGH 


5808


2 


433 


SLPDSG WE YLSNGGVADNHKDFGELRYNECLMNFS CNGKNGSS 
EGR ITHG FQLKS A YENNLM P YTNYTFDFKGV ID Y I FYS KTHMNV 

LGVLGPLDPQWLVENNITGCPHPHIPSDHFSLLTQLELHPPLLP 
LVNGVHLPNRR 


5809 


464 


2422 


ILVPGFQGILHPGVYCALQSQHQAQELVADIDECEVSGLCRKGG 
RCVNTHG3FECYCMDGYLPRNGPEPFHPTTDATSCTEIDCGTPP 
EVPDGYI IGNYTSSLGSQVRYACREGFFSVPEDTVSS CTGl/GTW 
BSPKLHCQEINCX3NPPEMRHAILVGNHSSRLGGVARYVCQEGFE 
SPGGKITSVCTEKGTWRESTLTCTEILTKINDVSLFNDTCVRWQ 
INSRRINPKISYVISIKGQRLDPMESVREETVNLTTDSRTPEVC 
IALYPGTNYTVNISTAPPRRSMPAVIGFQTAE VDLLEDDGS FN I 
S I FNETCLKLNRRSRKVGSEHMYQFTVIjGQRWYIiANFS HATSFN 
FTTREQ VP WCLDL YPTTDYTVNVTLLRS P KRHS VQI T IATP PA 
VKQTISNISGFNETCLRWRSIKTADMEEMYLFHIWGQRWYQKEF 

aqemtfnissssrdpevcldlrpgtnynvslralsselpwisl 

TTQITEPPLPEVEFFTVHRGPLPRLRLRKAKEKNGPISSYQVLV 
LPLALQSTFS CDSEGASS FFSNAS DADGYVAAELLAKDVPDDAM 
EIPIGDRLYYGEYYNAPLKRGSDYCIILRITSEWNKVRRHSCAV 
WAQVKDS S LMLLGMAG VGLGSLAWI ILTFLSFSAV 


5810 


3 


1641 


KVFGTHKDHEVSTLDTAISAVKVQLAEFLENLQEKSLRIEAFVS 
HIES FFNT I EENCS KNEKRIjEEQNEEMMKKVLAQYDE KAQS FEE 
VKKKKME FLHEQM VHFLQSMDTAKDTLETI VREAEELDEAVFLT 
S FEE I NERLL S AMESTASLEKMPAAFSLFEHYDDSSARS DQ ML K 
QVAVPQPPRLE PQEPNSATSTTIAVYWSMNKEDVIDS FQVYCME 
EPQDDQEVNELVEEYRLTVKES YCI FEDLEPDRCYQVWVMAVNF 
TGCS L PS ERA I FRTAP STP VI RAED CTVCWNTAT I RWRPTTPEA 
TETY TLEYCRQHS PEG EGLRS FS G I KGLQLKVNLQPNDNY FFYV 
RAINAFGTS EQSEAALISTRGTRFLLLRETAHPALHISS SGTVI 
SFGERRRLTEIPSVLGEELPSCGQHYWETTVTDCPAYRLGICSS 
SAVQAGALGQGETSWYMHCSEPQRYTFFYSGIVSDVHVTERPAR 
VGXLLDYNNQRLIFINAESEQLLFIIRHRFNEGVHPAFALEKPG 
KCTLHLG IB PPDSVRHK 


5811 


1918 


851


AAALADPLPEDKWSAEKRRPLKSSLGYE ITFSLLNPDP KSHDVY 
WD I EGAVRRYVQ P FLNALGAAGN FSVDSQI LYYAMLGVNPR FDS 
ASSSYYLDMHSLPHVINPVESRLGSSAASLYPVLNFLLYVPELA 
HS P L Y I QDKDGAPVATNAFHS PRWQG I M VYNVDS KTYNAS VLP V 
RVBVDMVRVMEVFLAQLRLLFGIAQPQLPPKCLLSGPTSEGLMT 
WELDRLLWARSVENLATATTTLTSLAQLLGKISNIVIKDDVASE 
VYKAVAAVQKSAEELASGHLASAFVAS QEAVTSS ELAFFDPSLL 

HLLYFPDDQKFAIYIPLFLPMAVPILLSLVKIFLETRKSWRKPE 
KTD 


5812 


5204 


2744 


GGRQRCQRGRSCGAREBEVEPGTARPPPAASAMDASLEKIADPT 
LAEMGKNLKEAVKMLEDSQRRTEEEMGKfOilSGDIPGPlOQGSGQ 
DMVS ILQLVQNIjMHGDEDEE pqs priqnigeqghmallghslga 
YISTLDKEKLRKIiTTRILSDTTLWLCRIFRYENGCAYFHBEERE 

giakicrlaihsryedfvvdgfnvlynkkpviylsaaarpglgq 
ylcnqlglpfpclcrvpcntvfgsqhqmdvafleklikddierg 
rlplllvanagtaavghtdkigrlkelceqygiwlhvegvnlat 

LALGYVSSSVLAAAKCDSMTMTPGPWLGLPAVPAVTLYKHDDPA 
LTLVAGLTSNKPTDKLRALPLWLSLQ YLGLDGFVER I KHACQLS 
QRLQESLKKVNYI KILVEDELSSPVWFRFFQELPGSDPVFKAV 
PVPKMTPSGVGRERHSCDALNRWLGEQLKQLVPASGLTVN5DLEA 
EGTCLRFSPLMTAAVLGTRGEDVDQLVACIESKLPVLCCTLQLR 
EEFKQEVEATAGLLYVDDPNWSGIGWRYEHANDDKSSLKSYPQ 
GENIHAGLLKKLNELESDLTFKIGPEYKSMKSCLYVGMASDNVH 
AAELVETIAATARE I EDNSRLLENMTE WRKGIQEAQVELQKAS 
EERLLEEGVLRQIPVVGSVLNWFSPVQALQKGRTFNLTAGSItES 
tepiyvykaqc^vtlpptpsgsrtkqiu,pgqkpfkrslrg'5da~ 
lse'i'ssvshiedlekverlssgpeqitleassteghpgapspqh 

TDQTEAFQKGVPHPEDDHSQVEGPESLR 


5813 


2936 


699 


HRDGVSGSLERPLTDRSRTGAPAO^RGKMATAGGGSGADPGSRG

LLRLIjSFCVLLAGLCRGNS VERK I Y I PLNXTA PCVRLLNATFQ I 

GCQS S ISGDTGV I HWE KEEDLQWVLTDGPNP P YMVLLE S KHFT 

RDLMEKLKGRTSRIAGLAVSLTKPSPASGFSPSVQCPNDGFGVY 

SNSYGPEFAHCREIQWNSLGNGLAYEDFSFPIFLLEDENETKVI 

KQCYQDHNLSQNGSAPTFPLCAMQLFSHMAWLSFSTAT\CMRRS 

SIQSTFSINPKIVCDPLSDYNVMSMLKPINTTGTLKPDDRVWA 

ATRLDSRSFFWNV\APGAESAVASFVTQLAAAEALQKAPDVTTL 

PRNVMFVFFQGETFDYIGSSRMVYDMEKGKFPVQLENVDSFVEL 

\»v » a djjci unrin i utr v oyiuv tso VKNQVEDLliATLBKSGAGVP 

AVILRRPNQSQPLPPS SLQRFLRARN ISGWLADHSGAFHNKYY 

QS IYDTABNIWVS YPEV7LEPLKE/ETWNFG* QDTAKALADVATV 

LGRAtiYELAGGTNFSDTVQADPQTVTRLLYG\FLIKANNSWFCS 

ILQGRDLRSYLG*RGLFQH\YIAV\SSPTNTIYV/VLQYALANL 

TGTWNLTREQCQDPSKVPSENKDLYEYSWVQGPUISNETDRLP 

RCVRSTARLARALSPAFELSQWSSTEY3TWTESRWKDIRARIFL 

IAS KELELITLTVGFG ILI FSL I VTYCINAKADVLFIAPREPGA 
VSY 


5814 


8500 


432 


ALKCRPRRVIiAI LVGP VQPDRMAE EG AVAVCVRVRPLtf SRkESL 

GBTAQVYWKTHNNVIYPVDGSKSFNFDRVLHGNETPKNVYEAM 

AAP 1 1 DSAIQGYNGTI FA\ YGQT\ASGKTYTMMGS EDHLG VI PQ 

GQ FHGH PSQK I * EVFLDRE FI>LR VS YME I YNBT I TDLLCGTQKM 

KPLIIREDVNRNVYVADLTEEWYTSEMALKWITKGEKSRHYGE 

TKMNQRSSRSHT I FRM I LESREKGEPSNCEGSVKVSHLNLVDLA 

GSERAAQTGAAGVRUKEGCNINRSLFILGQVIKKLSDGQVGGFI 

NYRDS KLTR ILQNSJuGGNPKTRI ICTI TPVSFDET.LTALQFAST 

AKYMKNTP YVNE VS TDEALLKR YRKE IMDLKKQLEE VS LE TRAQ 

AMEKDQIiAQLLEEKDLLQKVQNEKlENLTRMLVTSSSLTLQQEL 

KAKRKRRVTWCLGKINKMKNSITYADQFNIPTNITTKTHKLSINL 

LRE I DE S VCS ESDVFSNTLDTLSEI E WN P ATKLLNQEN IBS ELN 

SLRADYDNLVLDYEQLRTEKEEMELKLKEKNDIiDEFEALERKTK 

KDQBMQL IHE I SNLKNLVKHRE VYNQDLENELS S KVEIiLRE KED 

QIKKLQEYIDSQKLENIKMDLSYSLESIEDPKQMKQTLFDAETV 

ALDAKRESAFLRS ENLELKEKMKELATTYKQMEND I QLYQSQLE 

AKKKMQVDLEKELQSAFNEITKLTSUDGKVPKDLLCNLELEGK 

ITDLQKELNKEVEENEALRfiEVILLSBLKSLPSEVERLRKEIQD 

KSEELHIITSEKDKLFSEWHKESRVQGLLEEIGKTKDDLATTQ 

SNYKSTDQEFQNFKTLHMDFEQKYKMVLEENERMNQEIVNLSKE 

AQKFDSSLGALKTELSYKTQELQEKTREVQERLNEMEQLKEQLE 

NRDSPLQTVEREKTLITEKLQQTLEEVKTLTQEKDDLKQLQESL 

QrERDQLKSDIHDTVNMNIDTQEQLRNALESLKQHQETINTLKS 

KISEEVSRNLHMEENTGETKDEFQQKMVGIDKKQDLEAKNTQTL 

TAD VKDNE I IEQQRKI FS L IQEXNELQQMLES V I AEKEQLKTDL 

KENIEMTIENQEELRLLGDELKKQQEIVAQEKNHAIICKEGELSR 

TCDRLAEVEEKLKEKSQQLQEKQQQLLNVQEEMSEMQKKINEIE 

NLXNELKNKELTLBHM E TER LE LAQKLNEN YE EVKS I TKERKVL 

KELQKS PETERDHLRG Y IREI EATGLQTKEELKIAHI HLKEHQE 

T I DELRRS VS EKTAQI INTQDLE KSHTKLQE E I PVLHE EQELLP 

NVKKVSBTQETMNELELLTEQSTTKDSTTLARIEMERLRLNEKF 

QES QEE I KS LTKERDNLKTIKBALE VKHDQL fCEH IRETLAK IQ E 

SQSKQEQSLNMKEKDNETTKIVSEMEQFKPKDSALLRIEIEMLG 

LSKRLQESHDEMKSVAKEKDDLQRLQEVLQSESDQLKENIKEIV 

AKKLETEEELKVAHCCLKEQEETINELRVNLSEKETEISTIQKQ 

LEAINDKLQNKIQEIYEKEEQLNIKQISEVQEKVNELKQFKEHR 

KAKDSALQSIESKMLELTNRLQESQEEIQIMIKEKEEMKRVQEA 

LQIERDQLKENTKEIVAKMKESQEKBYQFLKMTAVNETQEKMCE 

I EHLKEQFETQKLNLEN I ETEN IRLTQILHENLEEMRS VTKERD 

DIJISVEETLKVERDQIiXENLRETITRDLEKQEELKIVHMHLKEH 
QBTIDKLRGIVSEKTNEISNMQKDLEHSNDALKAQbLKfQEELR 
IAHMHLKEQQETIDKLRGIVSEKTDKLSNMQKDLENSNAKLQEK 
IQELKANEHQLITLKIOVNETQKKVSEMEQLKKQIKDQSLTLSK 
LEIENLNLAQKLHENL2EMKSVMKERDNLRRVEETLKLERDQLK 
ESLQErTKARDLEIQQELKTARMLSKEKKETVDKLREKlSEKTIQ 
XSDIQKDLDKSKDELQKKIQELOKKELQLLRVKEDVNMSHKKIN 
EMEQLKKQPEPNYLCKCEMDNFQLTKKLHESLEEIRIVAKERDE 
LRR I KE S LKMERDQF I ATLREM I ARDRQNHQ VKPE KR LL SDGQQ 
HLMESLREKCSRIKELLKRYSEMDDHyECLNRLSLDLEKEIEFH 
R I M KKLKYVLS YVTK I KEEQH E C I N KFEMDF I DE VE KQKELL I K 
IQHLQQDCDVPSRBLRDLKLNQNMDLHIEEILKDFSESEFPSIK 
TEFQQVLSNRKEMTQFLEEWLNTRFDIEKLKNGIQKENDRICQV 
NNFFNNRIIAIMNESTEFEERSATISKEWEQDLKSLKEKNEKLF 
KN YQTLKTS LAS GAQVNP TTQDN KN P HVTSRATQLTTE KIRELE 
N S LHEAKE S AMH KE S KX I KMQKE LEVTUD 1 I AKLQAKVHE S N KC 
LE KTKET I QVLQDKVALGAKP Y KE E I EDLKM KLGKI DLE KMKNA 
KEFEKEISATKATVBYQKEVIRLLRENLRRSQQAQDTSVISEHT 
DPQPSNKPLTCGGGSGIVQNTKALILKSEHIRLEKEISKLKQQN 
EQIiIKQKNELLSMNQHLSNEVKTWKERTLKREAHKQVTCENSPK 
SPKVTGTASKKKQITPSQCKERNLQDPVPKESPKSCFFDSRSKS 
LPS PHP VRYFDNS S LGLCP EVQNAGAESVDS QP\GPWARL FQGK 
DVP\ECKTQ 


5815


23 


1460 


S ELVM WTVQNRESLGLLS F PVM ITM VCCAHSTNEPSNMS Y VKET 
VDRLLKGYD I RLRPDFGGP PVD VGMR I DVAS I DMVSEVNMDYTL 
TMYFQQSWKDKRLSYSG I PLNLTLDNRVADQLWVPDTYFLNDKK 
S FVHGVTVKNRI4IRI^PIX;TVLYGLRITTTAACMMDLRRYPLDE 
CNCTLEIES YGYTTDD I E FYWNGGEGAVTGVNKI ELFQFS I VDY 
KMVSKKVEFTTGAYPRLSLSFRLKRNIGYFILQTYMPSTLITIL 
SWVSFWINYDASAARVALGITTVLTMTTISTHLRETLPKIPYVK 
A ID I YLMG CFVFVFIiALL E YAFVNY I F FGKG PQ KKGAS KQDQ5A 
we, ftwiujDnw j\ vy vut\tv*N I L»Li5TLE I RNETSGSEVLTSVSDPKA 
TMYSYDSASIQYRKPLSSRE\A*GRAPDRHGVPSKGRIRRRAS\ 
QLKVKIPDLTDVNSIDKWSRMFFPITFSLFNWYWLYYVH 


5816 


861 


191 


TSSRSRAAAQEGDAETPGSVERRGRRAGAEDGMSQAPGAQPSPP
TVYHERQRLELCAVHALNNVLQQQLFSQEAADB I CKRLAPDS RL 
NPHRSLU5TGNYDVNVIMAAI^Gr^IiAAVWWDRRRPLSQLAL?Q 
v xjujj x -uNi-jfo rva LAaijua IiJ-IiKKKxi LKW P CARL / VT vS YYNLDS 
K\LRAPEGPGGLRTE\ *G PFLAAALAQGLCBVLLWTKEVEEKG 
SWLRTD 


5817 


851 


118 


RLFRGPGANRGRSCRGCSGGREPSGGALPKRHCPC*PPSPPAAD 
VMSNTTVPNAPQANSDSMVGYVLGPFFLITLVGVWAWMYVQK 
KKRVDRLRHHLLPMYSYDPAEELHEAEQELLSDMGDPKW\QAG 
RVATSTSG CHCWMS RRDLTPLPH PS E PG VLDCLG P CHLLP LL S P 
GSPCWVLGLHFSLHPPSAASASHALTITSLPPGLLPFVGVELTA 
HPQALMGRGFPSGMAAAGRHLCFTj 


5818 


3 


3318 


QALR DKLWIFLVQS F YAVR HTES WKLMS TDbQQ K I QAAAFDKGD
DRRLGKKP I FSS S QQRKQ VSDSGD I KI KS WRGNNKKE CWS YLS T 
NKKMKSDGLGASGHSSSTNRNSINKTLKQDDVKEKDGTKIASKI 
TKELKTGGKNVSGKP KTVTKS KTENGDKARLENMS PRQWERSA 
TAAAAATGQKNLLNGKGVRNQEGQISGARPKVLTGNLNVQAKAK 
PLKKATGKDSPCLSIAGPSSRSTDSSMEFSISTECLDEPKENGS 
TEEEKPSGHKLS PCDSPGQMMKNS VDSVKNST VAI KSRPVSR VT 
NGTSNKKS IHEQDTNVNNSVLKKVSGKGCS EPVPQAILKKRGTS 
NGCTAAQQRTKS TP SNLTKTQGS QGES PNS VKSS VS S RQS DENV 
AKLDHNTTTEKQAP KR KMVKQ VH T AL PKVNAXI VAM P KNLNQ S K 
KGETIjNNKDSKQKMPPGQVISKTQPSSQRPLKHETSTVQKSMFH 
DVRDNNNKDSVSEQKPHKPLINLASEISDAEAIiQSSCRP\DPQK 
PLNDQEKEKLALECQNISKLDKSLKHELESKQICLDKSETKFPN 

hketddcdaanicchsvgsdnvnskfysttalkymvsnpnensl 
nsnpvcdldsrsagqihlisdrenqvgrkdtnkqss i kcv3dvs 
lcnpertngtlnsaqedkkskvpvegltipsklsdbsamdedkh 
ATADSDVSSKCFSGQLSEKNS PKNMETSESPESHBTPETPFVGH 
WNLSTGVLHQRESPESDTGSATTSSDDIKPRSEDYDAGGSQDDD 
GSNDRG I S KCGTMLCHD FLGRS 9 S DTS T PEEL KI YDSNLR I E VK 
MKKQSSNDLFQVNSTSDDEIPRKRPEIWSRSAIVHSRERENIPR 
GSVQFAQEIDQVSSSADETEDERSEAENVAENFSISNPAPQQFQ 
G 1 1 NLAFEDATENECRE FS AN KKF KRS VLLS VD ECBE LGSDEGE 
VH T P FQAS VDS FS PSD VFDG I SHEHHGRTC YSRFS RES EDN ILE 
CKQNKGNS VC KNEST VLDLS S I DS SRKNKQS VSATEKKNT I DVL 
SSRSRQLLREDKKVNNGSNVENDI QQRS KFLDSDVKSQERP CHL 
DLHQRBPNSDIPKNSSTKSLDSFRSQVLPQEGPVKESHSTTTEK 
ANIALSAGDIDDCDTLAQTRMYDHRPSKTLSPIYEMDVIEAFEQ 
KVES3THVTDMDF*DDQHFAKQDWTItLKQIjLSEQDSNIiDVTNSV 
PEDL SLAQ YL INQTLLLARDS S KPQG I TH I DTLNRWS ELTS PLD 
SSAS ITMAS FSS EDCS P QGE WT ILELETQH 


5819 


1 


5557 


AAAGLLGALHLVMTLWAAARAEKEAFVQSES I IEVLRFDDGGL 
LQTETTLGLSSYQQKSISLYRGNCR?IRFEPPMLDFHEQPVGMP 
KME KVYLHNPSS E * T I TLVS I FATTS HFHAS FFQNRK I L PGGNT 
S FD VS / VFLARWGNVENTLF INTSNHGVFTY \ Q VFG VG VPNP Y 
RLRPFLGARVTVNSSFSPIINIHNPHSEPLQWEMYSSGGDLHL 
ELPTGQQGGTRKLWEIPPYETKGVMRASFSSREADNHTAFIRIK 
TNASDSTEFI ILPVEVEVTTAPGI YSSTEMLDFGTLRTQDLPKV 
LNLHIiLNSGTKDVPITSVRPTPQ\NDAITVHFKP ITLKAS \ESK 
YTKVAS IS PDAS KAK KPS QFSGKI TVKAKE KS YS KLE I P YQAE V 
LDGYLGFDHAATLFHI RDS PADP VERP I YLTNTFS FAIL IHDVL 
LPEEAKTMFKVHNFSKPVLILPNESGYIFTLLFMPSTSSMHIDN 
NILLITNASKFHLPVRVYTGFLDYFVLPPKIBERFIDFGVLSAT 
EASNILFAIINSNPIELAIKSWHIIGDG\LSIELVAVDRGNRTT 
IISSLPECEKSSSSDQSSVTLASGYF\AVFRVKLTAKKL\EGIH 
DG AI Q I TTDYE I LT I P VK \ AVI AVGS LTCS P KHWL PP S FPGKI 
VHQSLNIMNSFSQKVKIQQIRSLSBDVRFYYKRLRGNKEDLEPG 
KKSKIANIYFDPGLQCGDHCYVGLPFLSKSEPKVQPGVAMQEDM 
WDADWDLHQSLFKGWTGI KENSGHRLS AI FBVNTDLQKNI I SKI 
TAELSWPSILSS PRHLKFPLTNTNCSS \EEEITLENP /SQDVPV 
YVQFI PLALYSNPSVFVDKLVSRFNIiSKVAKlDLRTLEFQVFRN 
S AHPLQS STGFMEG\ LS PHL I LNL I LKPGEKKS VKVK\ FTP VHN 
RTVSSLI I VRNNLTVMDAVMVQGQGTTENLRVAGKLPGPGSSLR 
FKITEALLKDCTDSLKLREPNFTLKRTFKVENTGQLQIHIETIE 
ISGYSCEGYGFKWNCQEFTLSANASRDIIILFTPDFTASRVIR 
ELKFITTSGSEFVFrLNASLPYHMLATCAEALPRPNWELALYII 
ISGIMSALFLLVIGTA\YLEAQGIWBP\FRRRLS\FEASNPPFD 
VGRPFDLRRIVGISSEGNLNTLSCDPGHSRGFCGAGGSSSRPSA 
GSH KQ * GP S GHPHSS HSNRNS ADVDDVRAYNSGRTS SMTSAQAA 
SSQPANKTRPLVLDSNTGAQGHSAGRKSKGAKQSQHGSQHHAHS 
PLEQHPQPPLPPPVPQPQEPQPERLSPAPLAHPSHPERASSARH 
S SEDS D ITS LI EAMDKD FDHHDS PALE VFTEQPPS PLPKS KGKG 
KPLGRKVKP PKKQE E KEKKGKGKPQEDE LKDS LADDDS S STTTE 
TSNPDTEPLLKEDTEKQKGKQAMPEKKESEMSQVKQKSKKLLNI 
KKEIPTDVKPSSLELPYTPPLESKQRRNLPSKIPLPTAMTSGSK 
S RNAQ KTKGTSKLVDNRPPALAKFLPNSQELGNTS SS EG EKDS P 
PPEWDSVPVHKPGSSTDSLYKLSLQTLNADrFLKQRQTSPTPAS 
PSPPAAPCPFVARGSYSSIVNSSSSSDPKIKQPNGSKHKLTKAA 
S L PGKNGNP T FAAVTAG YDKS PGGNGFAKVS SNKTGFSSS LG I S 
HAP VDSDGSDSSGLWS PVSNPSS PDFTPLNSFSAFGNS FNLTGE 
VFS KLGLS RS CNQASQRS WNE FNSGPS YLWES PATDPS PS WPAS 
SGSPTHTATSVLGNTSGLWSTTPFSSSIWSSNLSSALPFTTPAN 
TLASIGLMGTENS PAPHAPSTSS PADDLGQTYNP WRI WS PTIGR 
RSSDPWSNSHFPHBN 


5820 


310 


1270 


RVSLSGPVSLGVLLCARSSTMGKRDNRVAYMNPIAMARSRGPIQ 
SSGPTIQ\VI*IDQGLPGKK*KSN*KRKRK/DSKALAEFEEKMN 
ENWKKELEKHREKLLSGSESSSKKRQRKKKEKKKSW*\DSSSS\ 
SSSSDSSSSSSDSEDEDKKQGKRRKKKKNRSHKSSESSMSETES 








DSKDSLKKKKKSKDGTEKEKDIKGLSKKRKMYSEDKPLSSESLS 
ESEYIEEVRAKKKKSSEEREKATEKTKKKKlCHlOfHQKifV'K'fcrvan 

SSSPDSP*K*EKSGFPYKESAMSEEISTVKTTrYLLKCMNFLVF 
GIIPGLFSSHSDATV 


5821 


179 


915 


KWRNQSWRWPKPGTNWMLSC^VC^RfeVtWTGSVWMRia/SKHPOT""" 

PT/IKDCSIAATGKRPSARFPHQRRKKRREMDDGLAEGGPQRSN 

TYVIKLFDRSVDLAQFSENTPLYPICRAWMRNSPSVRERECSPS 

SPLPPLPEDEEG\SEVTNSKSR*CVQACPPTHTPGGQPKNACR\ 

SR I PS P LAALRMQGT P*RWSPFEPEPSPSTLI YRNMQR W KR X RQ 

RWKEASHRNQLRYSESMKILREMYERQ 


5822 


464 


4379 


QTLKEMPIVriARDLESTASSSEDEEVISQEDHPCIMWTGGCRRj: 
PVLVFHADA I LTKDNN I R V I G ER YHLS YK I VRTDS RLVRS I LTA 
HGPHEVHPSSTDYNLMWTGSHLKPFLLRTLSEAQKVNHPPRSYE 
LTRKDRLYKNIIRMQHTHGFKAFHILPQTFLLPAEYAEFCNSYS 
KDRGP W I VKP VASS RGRG \ VYL INNPNQI S L E EN I LVS R Y I NNP 

lliddf:<fdvrlyvlvtsydplviyi,yeeglarfatvrydqgak 

NIRNQFMHLTNYSVNKKSGDYVSCDDP3VEDYGNKWSMSAMLRY 
LKQEGRDTTALMAHVEDLI I KTIISAELAIATACKTFVPHRSSC 
FELYGFDVLIDSTLKPKLLEVNLSPSLACDAPLDLKIKASMISD 
MFTWGFVCQDPAQRASTRPIYPTFESSRRNPFQKPQRCRPLSA 

sdabmknlvgsarekgpgklggsvlglsmeeikvlrrvkeendr 

RGGFIRXFPTSETWEIYGSYLEHKTSMNYMLATRLFQDRMTADG 
APELKI * S LNS KAKLHAAL YERKLLS LE VR KRRRRS S RLRAMRP 
KYPVITQPAEMNVKTJSTESEEEEEVALDNEDEEQEASQEESAGF 
LRENQAKYTPSLTALVENTPKENSMKVREWNNKGGHCCKLETQE 
LEPKFNLMQILQDNGNLSKMQARIAFSAYLQHVQI\RLMKDSGG 
QT FS AS WAAKEDEQMEL WRFLKRASNNLQHSLRM Vli PSRRLAL 
LERTR I LAHQLGDF 1 1 V YNKETEQMAEKKSKKKVEEEEEDGVNM 
ENFQEFIRQASEAELEEVLTFYTQKNKSASVFLGTHSKISKNNN 
NYSDSGAKGDHPETIMEEVKIKPPKQQQTTEIHSDKLSRFTTSA 
EKEAKLVYSNSSSGPTATLQKIPNTHLSSVTTSDLSPGPCHHSS 
u£»ux Ai, fbM fhq PTI LLNTVS ASAS P CLH PGAQN I PS P TGL P 
RCRSGSHT I GP FS S FQ S AAH I YSQKLS R PS SAKAGS C YLNKHHS 
G I AKTQ KEGEDAS L YS KR YNQSMVTAE LQRLAEKQAARQ YS PS S 
HINLLTQQVTNLNIiATGIINRSSASAPPTLRPIISPSGPTWSTQ 
SDPQAPENHSSSPGSRSLQTGGFAWEGEVENNVYSQATGWPQH 
KYHPTAGSYQLQFALQQLEQQKLQSRQLLDQSRARHQAIFGSQT 
LPNSNLWTMNNGAGCRISSATASGQKPTTLPQKWPPPSSCASL 
VPKPPPNHEQVLRRATSQKASKGSSAEGQLNGLQSSLNPAAFVP 
ITSSTD PAHTK I MNHKHTEKQ P VHHS W VHD 


5823 
5824 


42 


2293 


LLTALSMEGGGGRDEPSACRAGDVNMDDPKKEDILLLADEKFDF 
DLSLSSSSANEDDEVFFGPFGHKERCIAASLELNNPVPEQPPLP 
TSESPFAWSPLAGEICFVEVYKEAHLLALHIESSSRNQAAQAAKP 
BDPRSQGVERFIQESKF\KINLFEKEKEMKKSPTSLKRETYYLS 
DS PLLGP P VGEPRLLAS S PALPS SGAQARLTRAPG P PHSAHAL P 
RESCTAHAASQAATQRXPGTKLLLPRAAS VRGRGI PGAAEKPXK 
EIPASPSRTKI PABKESHRDVLPDKPAPGAVNVPAAGSHLGQGK 
RAIPVP\NKLGLKKTLLKAPGSYSN\LQRKSSSGA\VWSGASSA 
CTPQPVAKAKSSEFASI PAN*LPGLCPNI SKS \GRMGPAMbRPA 
\ r 'vajrvo \<*^onU/iJsJtvuvobljAAEQIjTAPP\SASPTQPQTPE 
GGG\QWLNSSCAWSESSQLNKTRSIRRRDSCLNSKTKVMPTPTN 
QFKIPKFSIGDS\PDSS TPKLS RAQRPQS CTS VGRVT VHS TP VR 
RSSGPAPQSLLSAWRVSALPTPASRRCSGLPPMTPKTMPRAVGS 
PL \ CVPARRRS SE PRKNS AMRTE PTRES NRKTDSR \ L VDVS PDR 
GSPPSRVPQALNFSPEESDSTFSKSTATEVAREEAKPGGDAAPS 
EALLVDIKLEPLAVTPDAASQPLI DLPL I DFCDTPEAHVAVGSE 
SRPLIDLMTNTPDMNKNVAKPSPWGQLIDLSSPLIQLSPEADK 
ENVDSPLLKF 




42 


2293 


LLTALSMEGGGGRDBPSACRAGDVNMDDPKKEDII^IJUJEKFDF 
DLSLSSSSANEDDEVFFGPFGHKERCIAASLELNNPVPEQPPLP 
rS BS PFAWS PLAGEKFVE VYKEAH LLALH 1 ESS SRNQAAQAAKP 








EDPRSQGVERFIQESKF\KINLFEKEKEMECKSPTSLKRETYYLS 
DSPLLGPPVGEPRLLASSPALPSSGAQARLTRAPGPPHSAHALP 
RESCTAHAASQAATQ RKPGTKLLLPRAAS VRGRG I PGAAEKPKK 
E I PAS PS RTK I PAEK E SHRDVLPDKFAPGAVNVPAAGS HLGQGK 
RAIPVP\NKLGLKKTLLKAPGSYSN\LQRKSSSGA\VWSGASSA 
CTPQFVAKAKSSEFAS IPAN* LPGLCPNI SKS\GRMGPAMLRPA 

xj \rnur v\j \^\OOWy>4JVKVJJ VOCiuAAbUL i Arr \SASPTQPQTPE 

GGG \QWLNSS CAWSES SQLNKTRS IRRRDSCLNS KTKVMPTPTN 
QFKI PKFS IGDS \PDS STPKLSRAQRPQS CTSVGRVTVKSTP VR 
RSSGPAPQSLLSAWRVSALPTPASRRCSGLPPMTPKTMPRAVGS 
PL\CVPARRRSSEPRKNSAMRTEPTRESNRKTDSR\LVDVSPDR 
GS PPS RVPQALNFS PE ES DSTFS XSTATEVAREEAKPGGDAA P S 
BALLVDI KLEPLAVTPDAASQPLI DLPL IDFCDTPEAHVAVGSE 
SR PL IDLMTNTPDMNKNVAKPS PWGQL I DLSSPLIQLS PEADK 
ENVDSPLLKF 


5825 


2 


4210 


flqiesaspXpfssgflaahphspggslatkgrsrlsapgmlhl 

SAAPPAPPPEVTATARPCLCSVGRRGDGGKMAAAGALERSFVEL 
SGAERERPRHFREFTVCS IGTANAVAGAVKYSESAGGFYYVESG 
KLFSVTRNRFIHWKTSGDTLELMEES LDI NLLNNAIRI>KFQNCS 
VLPGGVYVSETQNRVI I LMLTNQTVHRLLLPHPSRMYRSEL\A7D 
S QMQS I FTDI GXVD FTDPCNYQL I PAVPG I S PNS TASTAWLS SD 
GEALFAL P CAS GG I F VLKLPP YD I PGMVS WE LKQS S VMQRLLT 
GWM PTAI RGDQS PSDR PLS LAVHCVEHDAF I FALCQDH KLRM WS 
YKEQMOLMVADMLEYVPVKIODLRLTAGTGHKLRLAYSPTMGLYIi 
GIFVMHAPKRGQFCIFQLVSTESNRYSLDHISSLFTSQETLIDF 
ALTSTDIWALWHDAENQTVVKYINFEHNVAGQWNPVFNIQPLPEE 
EIVIRDDQDPREMYLQSLFTPGQFTNEALCKALQIFCRGTERKL 
DLSV?SELKKEVTLAVENELQGSVTEYEFSQEEFRNLQQEFWCKF 
YACCLQYQEALSHPLALHLNPHTNMVCLLKKG YLSFLI PS S LVD 
HL YLLP YENLLTEDETT I S DDVDI ARDVI CLI KCLRLI EESVTV 
DMSVIMEMSCVKLQSPBKAAEQILEIDMITIDVENVMEDIC 
EIRNPIHAIGLLIREMDYETEVEMEKGFNPAQPLNIRMNLTQLY 
GSNTAG YI VCRGVHKI ASTRFL I CRDLL I LQQLLMRLGDAV I WG 
T6QLFQAQQDLLHRTAPLLLSYYLIKWGSECLATDVPLDTLESN 
LQHLS VLELTDSGALMANR FVSS PQT I VELFFQ EVAR KH 1 1 SHL 
FSQPKAPLSQTGLNWPEMITAITSYLLQLLWPSNPGCLFLECLM 
GNCQYVQLQDYIQLLHPWCQVNVGSCRFMLGRCYLVTGEGQKAL 
ECFCQAASEVGKEEFLDRLIRSEDGEIVSTPRLQYYDKVLRLLD 
w AULir ctu v x UiJrt I Sftl 1 1J JJW \ K£» QATJj \ RTCI FKHHL\ DLG 

\HNSQAYGSL* PQI PDSSRQLDCLRQLVWLCERSQLQDLVEFS 
YVNLHNEWGI IESRARAVDLMTHNYYELLYAFHI YRHNYRKAG 
TVMFE YGMRLG RE VRTLRGLEKQGNC YLAALNCLRLI R PEYAWI 
VQP VSGAVYDR PGAS PKRNHDGECTAAPTNRQ I EI LELEDLEKE 
CSLAR I RLTLAQHD ? S AVAVAG S SS AEEM VTLLVQAGL FDTA I S 

LCffrFKLPLTPWEGLAFKCIKLQFGGEAAQAEAWAWLAANQLS 

SVITTKESSATnF&WRT.T.QTVT.cwvir^rrkvrMT vuuputmvt t a«m 
■* * x - 1 -" J -'" AiyCkrt.rn^LU-io x l J^aKxx\.vywwijinHL VXNKIiLSHG 

VPJjPNWL INS YKK VDAAELLRL YLNYDLLDI»TP YQ VTR ICGC 


5826 


3 


871 


ksqllrdhsapppkpctsvgamgc+prq/spkeqqrqlkkqknr 

AAAQRSRQKHTDKADALHQQHESLEKDNLAIjRKEIQSLQABLAW 
WSRTIjHVHERLCPMDCASCSAPGIiLGCMDQAEGLLGPGPQGQHG 
CREQLELFQTPGSCYPAQPLSPGPQPHDSPSLLQCPLPSLSLGP 

awaeppvqlspspllfashtgsslqgsssklsalqpsltaqta 

PPQPLELEHPTOGKLGSSPDNPSSALGIjARLQSREHKPALSAAT 

wqglvvdpsphpuafpllssaqvhf 


5827 


194 


2287 


GMGSENSALXSYTLREPPFTIiPSGDAVYPAVLQDGKFASVFVYK 
RENEDKVNKAAKVP* *HLKTLRHPCLLRFLSCTVEAIX3IHLVTE 
RVQPLEVALETIJSSAEVCAGIYDILIiALIFLHDRGHLTHNNVCL 
SSVFVSEDGHWKLGGMETVCKVSQATPEFLRSIQSIRDPASIPP 
EEMSPEFTTLPECHGHARDAFSFGTLVESLLTILNEQVSADVLS 
SFQQTLHSTLLNPIPKWRPALCTLLSHDFFRNDFLEWNFLKSL 
TLKS EEE KTE FFKFLLDRVS CI»S EEL IAS RL VPLLLNQL VFAE P 








VAV \KSFLP YLU3PKKDHAQGETPCLLS PAL FQSRVI PVLLQIiP 
EVHEEHVRMVLLSHI KAYVGALSIiREOLina/V TTA nmn t r*\ t -a 

D \ TSDS I VAITLHS LAVLVS LLG PEVWGGERTK I FKRTAP \ S F 
TK\NTDLSLEGDPFSQPIKFPINGLSDVKNTSEDSENFPSSSKK 
SEEWPDWSGPE\EPENQTVNI\QIWP\REP\CDDVKSQCITLDV 
EESSWDDCEPSSLDTKVNPGGGITATKPVTSGEQKPIPALLSLT 
BESMPWKSSLPQKISLVQRGDDADQIEPPKVSSQERPLKVPSEL 
GLiGEEFTIQVKKKPVKDPEMDWFADMIPEIKPSAAFLILPEXiRT 
EMVPKKDDVSPVMQFSSKFAAAEITEGEAEGWEEEGELNWEDNN 
W 


5828 


2 


257 


AREGGSLGAVAACGELSYSCDFCPARPHTSWjTRFVKMEFQAVV 
MAVGGGSRMTDLTSS I PKPLLPVGNKFLIWY?LNLLERVGFEEV 
IWTTRDVQKALCABFKMKMKPDIVCIPDDADMGTADSLRYIYP 
KLKTDVLVLSCDLITDVALHEVVDLFRAYDASLAMLMRKGQDSI 
EPVPGQKGKKKAVEQRDFIGVDSTGKRLLFMANEADLDEELVIK 
GS ILQKHPRIRFHTGLVDAHLYCLKKY I VDFLMENG \ S ITS IRS 

BL\IPYLV/RGKQFSSASSQQGTRKEKEGGSKGKRGLKSFRISY 
SPY* KEANYTETfiAP V\ Ti\ fcPWT 


5829 


260 


1259 


PDGRLI VSCSEDKT I KI WDTTNKQCVNNFSDSVGFANFVDFNPS 
GTCIASAGSDQTVKVWDVRVNKLLQHYQVHSGGVNCISFHPSGN 
YIiITASSDGTLKILDLLKGRLIYTLQGHTGPVFTVSFSKGGELF 
ASGGADTQVL LWRTNFDELHCKGLTKRNL KRLHFDS P PHLLD I Y 
PRTPHPHEEKVETVEDFFLHLLRLIQSLR*SICRSLLPLLWISF 
Lhl LPQQQKPWGLCQTRVKRPVDIS *TIiP*CHQNVCQQPRKRK 
QKT*VTSPVKVK/VSIPIAVTDALEHIMEQLNVLTQTVSILEQR 
LTLTEDKLKDCLENQQKLFSAVQQKS 


5830 


4496 


3139 


GGKMAAPEERDLTQEQTEKLLQFQDLTGIESMDQCRHTLEQHNW 
NIEAAVQDRLNEQEGVPSVFNPPPSRPLQVNTADHRIYSYWSR 
cytrtujijLteiKKjx XLixwiif rKr IX 1 IIJjDIFRFALRFIRPDPRSRV 
TDPVGD I VS FMHS FEBKYGRAHPVFYQGTYSC2ALNDAKRELRFL 
LVYLHGDDHQDSDEFCRNTLCAPEVISLINTRMLFWACSTNKPE 
GYRVSQALRENTYPFLAMIMLKDRRE* PV\VGRLEGLI \QPDDL 
INQLTFIMDANQTYLVSERLEREERNQTQVLRQQQDEAYliASLR 
ADQEKERKKRBERERKRRKKEEVQQQKLAEERRRQNLQBEKERK 
LECLPPEPSPDDPESVKIIFKLPNDSRVERRFHFSQSLTVIHDF 
LFSLKESP\EKFQIEA\NFPRR\VLPCIPSEE\WPNPPTIiQE\A 
GLSHTEVLFVQDLTDE 


5831 


71 


2897 


FCSKDKCCLYLPDSINRSKSCrAKPGAHSQDRHAVMD^dRQVKD 
TDDIESPKRS IRDSGYIDCWDSERSDSLSPPRHGRDDS FDSLDS 
FGS RSRQTPS PD WLRGS S DGRGSDS ES DL PHRKL PDVKKDDMS 
ARRTSHGEPKSAVPFNQYLPNKSNQTAYVPAPLRKKKAEREEYR 
KSWSTATSPAGLGKKALQDYGPRT\PVS\DDAE5TSMFDMRC3E 
EAAVQPHSRARQEQLQLINNQLREEDDKWQDDLARWKSRKRSVS 
QDLI KKEEERKKMEKLLAGEDGTSERRKS I KTYRBI VQEKERRE 
RELHEAYKNARSQEEAEGILQQYIERFTISEAVLERLEMPKILE 
RSHSTEPNLSSFLNDPNPMKYLRQQSLPPPKFTATVETTIARAS 
VLDTSMS AGSGS PS KTVTPKAVPMLTPXPYSQPKNSQDVLKTFK 
VDGKVS VNGET VHR E EE KER ECPTVAP AHS LTKS QM FEG VARVH 
GSPLELKQDNGSIEINIKKPNSVPQELAATTEKTEPNSQEDKND 
GGKS R KGNIELAS S E PQH FTTT VTRCS PTVAFVE EPS S PQLKND 

KMPEANQLHLPNLNSQVDSPSSEKSPVTTPFKFWAWDPEEERRR 
QEKWQQEQERIjLQERYQ\KEQDK\LKEE\WEKAQKEVEEEBRRY 
YEBEP+ 1 I\EDPWPFTVSSSSADQLSTSSSMTEGSGTMNKIDL 
GNCQDEKQDRRWKKSFQGDDSDLLLKTRESDRLEEKGSLTEGAL 
AHSGNP VS KGVHEDHQLDTE AGAPHCGTNPQIAQDP SQNQQTSN 
PTHSS EDVKPKTLPLDKS INHQ I ES PS ERRKS ISGKKLCSS CGL 
PLGKGAAM I IETLNL YFH IQ CFRCG\ I CKGQLGDAVSGTDVRIR 
NGLLKCNDCYMRSRSAGQPTTL 


5832 


2454 


829 


PGRRFRHGSCAFQKQCIMLHICQYFLQGECKFGTSCKRSrJDFSN 
SENI^KLEKI^SSDLVSRLPTIYRNAHDIKNKSSAPSRVPPLF 








VPQGTSERKDSSGSVSPNTLSQEEGDQICIiyHIRKSCSFQDKCH 
RVHFHLPY RWQb'LDRGKWEDLDNMELI BEAYCNPKI ER I LCSES 
ASTFHSHCLNFNAMTYGATOARRLSTA<5SVTlf PDUI?T T 'rTntLTTM 

YWSDEFGSVJQEYGRQGTVHPVTTVSSSDVEKAYLAY/WYTGV*R 
PG SHLEVPGRKAQLRVRFQS LRS E KPGLWHN* KGLPQTQ I R\ AP 
QDVTTMQTCNTKFPGPKS I PDYWDSSALPDPGFQKITLSSSSEE 
YQKVWNLFNRTL P FY FVQKI ERVQNLALWEVYQWQKGQMQKQNG 
GKAVDERQLFHGTSAI FVDAI CQQNFDWRVCGVHGTS YG KGS Y F 
ARDAAYSHHYSXSDTQTHTMFLARVLVGEFVRGNASFVRPPAKE 
GWSKAFYDSCVNS VSDPS I FVI FEKHQVYPEYVIQYTTSS XPS V 
TPSILLALGSLFSSRQ 


5833 


170 


3289 


£ I LC L LS P C WQ FGKP WS I LS S R S RHS PCTKKGWEGMR KHLHT 
RQGHK*VHVEISKALWVYRDDYFIRHSISVSAVIVRAWITHKYR 
GRDWNVKWEENIiLHAVAKNYTLLQTI PPFERPFKDHQVCLEWNM 
GY I WNLRANR I PQCPLEND WALLGFPYASSGENTGI VKKFPRF 
RNRELEATRRQRMDYPVFTVSLWLYLLHYCKANLCGILYFVDSN 
EMYGTPSVFLTEEGYLHIQMHLVKGEDLAVKTKFIIPLKEWFRL 
DI S FNGGQI WTTS IGQDLKS YHNQT ISFREDFH YNDTAGYFI I 
GGSRYVAGIEGFFGPLKYYRLRSLHPAQIFNPLLEKQLAEQrKL 
Y YERCAEVQE I VS WASAAKHGGERQEACHLHNS YLDLQRR YGR 
PSMCRAFPWEKELKDKH PS LFQALLEMDLLTVPRNQNESVS E IG 
GKIFEXAVKRLSSIDGLHQISSIVPFLTDSSCCGYHKASYYLAV 
FYETGLNVPRDQLQGMLYSLVGGGGSERLSSMNLGYKHYtfSIDN 
YPI#DWELSYAYYSNIATKTPLDQHTL»QGDQAYVETIRTiKDDE I L 
KVQTKEIX3DVFMWLKHEATRGNAAAQQRIAQMLFWGQQGVAKNP 
EAAIEWYAKGALETEDPALIYDYAIVLFKGQGVKKNRRLALELM 
JS-R-ftAb KblinQA v NGLG WY Yn KFKKNYA\KAAKYWLKA\EE\MGN 
PDASYNLGVLHLDGI FPGVPGRNQTLAGEYFHKAAQGGHMEGTL 
WCSLYYITGNLETFPRDPEKAWWAKHVAEKNGYLGHVIRKGLN 
AYLEGS WHEALL YYVLAAETGIEVSQTNLAHI CEERPDLARRYL 
GVNCVWRY YNFS VFQIDAPS FAYL KMGDLY YYGHQNQSQDLELS 
VQMYAQAALDGDSQGFFNLALLIEEGTIIPHHILDFLEIDSTLH 
SNNISILQELYERCWSHSNEESFSPCSLAWLYLHLRLLWGAILH 
SAL I YFI/3TFL LS I LI AWTVQ YFQS VS ASD P PPRPSQAS PDTAT 
STAS PAVTPAADASDQD Q PT VTNNPE PRG 


5834 


17 


4020 


RFRRGGGRVFPGAFPASPSDSLGQGNSQGPPRTPKPPRT/QECG 

SAAPGPIPGQSSS^VPLRLEQIQQKADCPLSLELALKPRMAAQV 

TLBDALSNVDLLEELPLPDQQPCIEPPPSSLLYQPNFNTNFEDR 

NAFVTGIARYIEQATVHSSMNEMLEEGQEYAVMLYTWRSCSRAI 

PQ VKCN EQPNR VE I YE KTV E VLE PE VTKLMN FMY FQRNAI ERFC 

GEVRRLCHAERRKDFVSEAYLITLGKFINMFAVLDELKNMKCSV 

KNDHSAYIO^AAQFLRKMADPQSIQESQNLSMFLANHNKITQSLQ 

QQLBVISGYEEU^IVNLCVDYYENRMYLTFSEKHMLLKVMGF 

GLYLMDGS VSNIYKLDAKKR INLS KI DKYFKQLQWPLFGDMQ I 

ELARYIKTSAHYEENKSRWTCTSSGSSPQYNICEQMIQIREDHM 

RFISELARYSNSEVVTGSGRQEAQKTDAEYRKLFDLALQGLQLL 

SQWSAHVMEVYSWKLVHPTDKYSNKDCPDSAEEYERATRYNYTS 

EEKFALVEV I AM I KGLQVLMGRiMES VFNHAIRHT VYAALQDFS Q 

VTLMEPLRQAIKKKKNVIQSVLQAIRKTVCZDWETGHEPFNDPAL 

RGEKDPKSG*D I KVPRRAVGPSSTQLYrWRTMLESLIADKSGSK 

KTLRSSLEGPT ILD I EKFHRES FFYTHLINFSETLQQCCDLSQL 

WFREFFLELTMGRRIQFPIEMSMPWILTDHILETKEASMMEYVL 

YSLDLYNDSAHYALTRFNKQFLYDEIEAEVNIXTFDQFVYKLADQ 

I FA Y YKVMAGS LL LDKRLR S E CKNQGAT IHLPPSNR YETLLKQR 

HVQLLGRSIDLNRLITQRVSAAMYKSLELAIGRFESEDLTSIVE 

LDGLLEINRMTHKLLSRYLTLDGFDAMFREANHNVSAPYGRITL 

HVFWELNYDFLPNYCYNGSTNRFVRTVLPFSQEFQRDKQPNAQP 

QYLHGS KALNLAYSS I YGS YRNFVG PPHFQ VICRLLG YQGIAW 

MEELLKWKSLLQGT I LQYVKTLMEVM PKI CRLPRHE YGS PGIL 

EFFHHQLKDIVEYAELKTVCFQNLREVGNAILFCLLIEQSLSLE 

E VCDLLHAAP FQN I LPR VH VKEGE RLDAKM KRLE S KYAPLHL VP 








LIBRLGTPQQIAIAREGDLLTKERLCCGLSMPBVIL*TRIRSFLD 
DPIWRGPLPSNGVMHVDECVEFHRLWSAMQFVYCIPVGTHEFTV 
EQCFGDG LHWAGCMI I VLLGQQRRPAVLDFCYHIjLKVQKHDGKD 
EI I KNVPLKKMVERIRKFQI LNDE I IT ILDKYLKSGDGEGTPVB 
HVRCFQPPIHQSLASS 


5835 


4209 


1904 


sgnirmaqgshqidfqvlhdlrqkfpevpbvwsr04lqknnnl 
daccavlsqestrylygegdlnfsddsgisglrnhmtslkldlq 
sqniyhhgregsrmngsrtlthsisdgqlqggqsnselfqqepq 

TAPAQVPQGFNVFGMSSSSGASNSAPHLGFHLGSKGTSSLSQQT 
PRFNPIMVTIiAPNIQTGRNTPTSLHIHOVPPPVLNSPQGKSIYI 
RPYITTPGGTTRQTQQHSGWVSQFNPMNPQQVYQPSQPGPWTTC 
PASKPLSHTSSQQPNQQGHQTSHVYMPISSPTTSQPPTIHSSGS 

sqssahsqyniqnistgprknqieikleppqrnnssklrssgpr 
tsstsssvnsqtlnrnqptvyiaasppntdblmsrsqpkvyisa 

NAATUDliQ VMRNQP TLF 1 S TNS GASAASRNMSG Q VSMGPAF IHH 
HPPKSRAIGNNSATS PRVWTQPNT\ E YTFKITVS PNKP PAVS P 
GWSPTFELTNLLN-^DHYVETENIHHLTDPTLAHVDRISETRK 
LSMGSDDAAYTQDI *RISNS WLGMVAHACNS SALGG QDGR 1 1+ A 
QEFETS WGNI WRLRLYRRF*NYAGMVAHTCSPS YSVD * ALLVHQ 
KARMERLQRELE IQKKKLDKLKS EVNEMENNLTRRRLKRSNS IS 
Q I PS LE EMQQLRS CNRQLQ I D I DCLTKE I DL FQARG PHFNPSAI 
HNFYDNIGFVGPVPPKPKDQRSIIXTPKTQDTBDDEGAQWNCTA 
CTFLNH PAL I R CEQCEM PRH F 


5836 


361 


2303 


FHITMCGICCSVNFSAEriys^LKEDLLYNLKQRGPNSSKQLLK 
SDVNYQCL FSAHVLHLRG VLTTQP VE DE RGNVFLWNGE I FS G I K 
V2AEENDTQILFNYLS SCKN ES E ILSLFS E VQG PWS FI Y YQ AS S 
HYLWFGRDFFGRRSLLWHFSNLGKSFCLSSVGTQTSGLANQWQE 
VPAS \ D FS E L I LS LLS FPDAIiF YNC I LGNI FLGR I LLKKML IA* 
VXFQQT YQHLYQR * QMKPNC I LKNLLFL * I * CCKKLHWRLI AV I 
FPMCHLQER YFKS FLLMYT * KEVIQQFI DVLS VAVKKR VLCLPR 
DENLTANE VLKTCDRKANVAI IiFSGG I DS MVI ATLADRHI PLDE 
PIDLLNVAFIAEEKTMPTTFWRSGNKQKNKCEIPSEEFSKDYAA 
AAADSPNKHVSVPDRITGRAGLKELQAVSPSRIWNFVEINVSME 
ELQKLRRTRICHLI R PLDTVLDDS IG CAVW FASRG I G WL VAQEG 
VKS YQSNAKWLTG IGADEQLAGYSRHRVRFQSHGLEGLNKEIM 
MELGRISSRI^LGRDDRVIGDHGKEARFPFLDENWSFLNSLPIW 
EKANLTLPRGIGEKLLLRIiAAVELGLTASALLPKRAMQFGSRIA 
KMEKINEKASDKCGRLQIMSLENLSIBKETKL 


5837 


4792 


903 


NGNAVAQAP VTNCCYLATGSKDQTIRI WSCSRGRGVM I LKLPFL 
KRRGGG I DPTVKERLWLTLHWPSNQPTQLVSS CFGGELLQWDLT 
QSWRRKYTLFSASSEGQNHSRIVFNLCPLQTEDDKQLLLSTSMD 
RDVKCWDIATLECSWTLPSLGGFAYSLAFSSVDIGSIiAIGVGDG 
MIRVWNTLSIKNNYDVKNFWO^VKSKVTALCWHPTKEGCLAFGT 
DDGKVGLYDTYSNKPPQISSTYHKJCTVYTLAWGPPVPPMSLGGE 
GDRPSLALYSCGGEGIVLQHNPWKLSGEAFDIMKLIRDTNSIKY 
KLPVHTEI S WKADGKIMALGNEDGSIE I FQ\ IPNLKLI CTIQQH 
HKL VNTI S WHHE \HGS PAQKLS YL \MPSGSQQCS PFTCHNLKNC 
P * KAAPES PSDPLQS P YRTPPQGHTAQD YPVWAWEPH IH* WEGL 
VFCFP IDG YS PGCWD\AFPGKEAPVAI FRG\HQGRLLCVAWSPL 
DPDCI YSG \ ADDFCVHKWLTS MQDHSR P PQG K KS IE LEKKRLS Q 
PKAKPKKKKKPTLRTPVKLESIDGNEEESMKENSGPVENGVSDQ 
EGEEQARE PELPCGLAPAVSREP VICTPVSSG FEKS KVTINNKV 
ILLKKEPPKEKPETLIKKRKARSLLPLSTSLDHRSKEELHQDCL 
VLATAKHSRELNEDVSADVEERFHLGLFTDRATLYRMIDISGKG 
HLENGHP ELFHQLM LW KGDL KGVL QTAAE RG ELT DNL VAMAP AA 
GYHVWLWAVEAFAKQLCFQDQYVKAASHLLS IHKVYEAVELLKS 
NHFYREAIAIAKARLRPSDPVIiKDLYLSWGTVLERDGHYAVAAK 
CYLGATCAYDAAKV LAKKGDAA S LRTAAE LAAI VGE DELS AS LA 
IiRCAQELLLANNWVGAQ3AU3LHESLCX5QRLVFCLLELLSRHLE 
EKQLSEGKSSSSYHTWNTGTEGPFVERVTAVWKSIFSLDTPEQY 
QEAFQKLQN I KYPSATNKTPAKQLLLH I CHDLTLAVLSQQMAS W 








DEAVQALLRAWRS YDSGS FT I MQE V YSAFLPDGCDHLRDKLGD 
HQSPATPAFKSLEAFFLYGRLYEFWWSLSRPCPNSSVWVRAGHR 
TLSVEPSQQLDTASTEBTDPETSQPEPNRPSEIjDIiRLTEEGERM 
LST FKELFSE KHASLQNS QRTVAEVQETLAEM I RQHQKSQLCKS 
TANGPDKNBPEVEAEQPbCSSQSQCKEEKNEPLSLPELTKRLTE 
ANQRMAKFPESIKAWPFPDVLECCLVLLLIRSHFPGCIAQEMQQ 
QAQELLQKYGNTKTYRRHCQTFCM 


5838 


110 


98 


KTMPHLLVTFRDVAIDFSQEEWECJUDPAQRDLYRDVMLENYSNL 
ISLDLESSCVTKKLSPEKEIYEMES\PSGRIWGNVSTITFQYNG 
tiGDNMECKGNLEGQVSKSEGLYMCVKITCEEKATESHSTSSTFH 
RII /HYQGKIVKCKECRQGFSYLSCLIQHEENHNI* KCSEVNKH 
RNTFSKKPSYI*HQ\KFRIiGEKPYECMECGKAFGRTSDLIQHQK 
IHTNEKPYQCNACGKAFIRGSQLTEHQRVHTGEKPYDCKKCGKA 
FSYCSQYTLHQR1HSGEKPYECKDCGKAFILGSQLTYHQRIHSG 
EKPYECKECGKAFILGSHLTYHQRVHTGEKPYICKECGKAFLCA 
SQLNEHQRIHTGEKPYECKECGKTFFRGSQLTYHLRVHSGERPY 
KCKECGKAFISNSNLIQHQRIHTGEKPYKCKECX3KAFICGKQLS 
EHQRIHTGEKPFECKECGKAFIRVAYLTQHEKIHGEKHYECKEC 
GKTFVRATQIiTYHQRIHTGEKPYKCKECDKAF/HLWLTILSEHQ 
RIHRGEKPYECKQCGR/LFIRGSHL/NEHLRTHTGEKPYECKEC 
GRAFSRGSEHTLHQRIHTGEKPYTCVQCGKDFRCPSQLTQHTRL 
HN*EYSSHKICMHSIALASLDFAHLQEKNPEN 


5839 


1 


2425 


GRPFPRPPRALPRLPLRGRRQDGRWTVDFEECLKD\SPRFRAAL 
EEVEGDVAELELKL\DKLVKLCIA\MIDTGKAFCVANKQFMNGI 
RD\ LAQNS \NNDA\ WETKFAPS FLDS LQEMINFHTIL/L* PNS 
EIN*GHS FQNFVKEDLRKFKDAKKQFE^SQ* KRKKIALVKNAPV 
PSRPASLEL* KP PNILTATRKCFRHIALDYVLQINVLQSKRRSE 
ILKSMLSFMYAHLAFFHQGYDLFSELGPYMKDLGAQLDRLVGDA 
AKEKREMEGKHSTIQQKDFSRDDSKLKYNVDAANGIVMEGYLFK 
RASNAFKTWNRRWFS IQNNQWYQ KKFKDN PTWVEDLRLCTVK 
HCEDIERRFCFEWSPTKSCMLQADSEKLRQAWIKAVQTS I \AT 
AYRBKDDESEKLDKKSS PS TGSLDSGNESKEKLLKGESALQRVQ 
CIPGNAS CCDCGLADPRWAS INLGITLCI ECSGIHRSLGVHFSK 
VRSLTLDTWEPELLKIjyiCELGNDVINRVYEANVEKMGIKKPQPG 
QRQEKBAYIRAKYVERKFVDKIFL*SLSPP\BQQKK\FVSKSSE 
EKRLS ISKFGP \GDQVRASAQSSVRSNDSGIQQS SDEX5RBSLPS 
TVSANSLYEPEGERQDSSMFLDSKHLNPGLQLYRASYEKNLPKM 
AEALAHG ADVNV7 ANS E ENKAT PL IQAVLGGSLVTCEFLLQNGAN 
VNQRDVQGRGPLHHATVLGHTGQVCLFLKRGANQHATDEEGKDP 
LSI AVE AANAD I VTLLRLARMNEEMRE S EGLYGQ P GDETYQD I F 
RDFSQMASNNPEKLNRFQQDSQKF 


5840 


698 


3610 


KHLHliPRQHLTTLWQI S S PRWRS PQRAFMSALSKTQTQSAPALQ 
GLSSLLQSVTGNPVPASEAASQSTSASPANTTVYTIKGRKLPSS 
AQPFI PKSFNYS PNSSTSEVSSTSASKAS IGQSPGLPSTAFKLP 
SNTKGFTATHNTS PAAPPTE VTICQSSE VSKPKL\ESESTS PS L 
\EMKIHNFLKGNPGFSVA*NLKHPNPAGSLGSSAPSESHPSDFQ 
RGPTSTSIDNIDGTPVRDERSGTPTQDEMMDKPTSSSVDTMSLL 
SKIISPGSSTPSSTRSPPPGRDESYPRELSNSVSTYRPFGLGSE 
SPYKQPSDGMERPSSLMDSSQBKFYPDTSFQEDEDYRDFEYSGP 
PPSAMMNLQKKPAKSILKSSKLSDTTEYQPIIiSSYSHRAQEFGV 
KS AF P PS VRALLDS S ENCDRLSSS PGLFGAFS VRGNB PGS DRS P 
SPSKNDSFFTPDSNHNSLSQSTTGHLSLPQKQYPDSPHPVPHRS 
LFS PQNTLAAPTGH P PTSG VE KVLAS T I STTST I EF KNMLKNAS 
RKPSDD KH FGQAPS KGT PS DGVSLSN LTQ P SLTATDQQQQB EH Y 
RIETRVSSSCLDLPDSTEBKGAPIETIiGYHSASNRRMSGEPIOT 
VESIRVPGKGNRGHGREASRVGWFDLSTSGSSFDNGPSSASELA 
SLGGGGSGGLTGFKTAPYKERAPQFQESVGSFRSNSFNSTFEHH 
LP PSPLEHGTPFQRB P VGPSSAPPVP PKDHGGI FSRBAPTHLPS 
VDLSNPFTKEAALAHAAPPPPPGEHSGIPFPTPPPPPPPGEHSS 
SGGSGVPFSTPPPPPPPVDHSGWPFPAPPLAEHGVAGAVAVFP 
KDHSS LLQGTLAEH FGVLPGPRDHGG PTQRDLNGPGLS R VRESL 
5841 






TLPSHSLEHLGPPHGGGGGGGSN&SSGPPIX3PSHRDTISRSGII 

LRSPRFDPRPREPFLSRDPFHSLKRPRPPFARGPPFFAPKRPFF 
PPRY 


5842 


1908 


762 


GLRLFLVLTVWPf^KPSWLSRTEP^Tn?T.T^PTT.cjr , nc^Mfc - enev — 
TRSMLKMTTS I NRRS RTS TKS TRTS ARPGLTATVS I G LS DS PTW 

rhckmtar s csgekgghwaprqvg vy llpgrvg cvs s rvs psfp 
gdgu>sglarrgsavsalasglveepmlgppfhptprfkavsak 

SKEDLVSQGFTEFTIEDFHNTFMDLIEQVEKQTSVADLLASFND 
QSTSDYLWYLRLLTSGYLQRESKFFEHFIEGGRTVKEFCQ\QE 

\vepmckesdhihiialaqglqrvhpgweymgprpraattnphi 

FP*GLPSPKVYLLYRPG\HYDILYKIGLGSSPLGCPGCPLLARA 
LGHCYRGFSVVVKWSYFTPFFLSHDPPPMFY 






1918 


QEPTADFKLRSTCGCGREMTCPDKPGQIjINWFICSLCVPRVRKL 
WSSRRPRTRRWLTjLGTACAI YLG FLVSQVGRAS LQHGQAABKGP 
HRSRDTAEPSFPEIPLDGTlAPPESQGNGSTLQPNVVYITIiRSK 

rskpanirgtvxpkrrkkhavasaapgqealvgpslqpqea\eg 

ixj nuu ^ uk£>u a "I'KjjabUi'GGWCGVRE/WRAGGPDFLQPSS 

resniriysesapswlskddirrmrlladsavaglrpvssrsga 

RIiLVLEGGAPGAVLRCGPSPCGLIiKQPLDMSEVFAFHLDRILGL 

nrtlpsvsrkaefiqdgrpcpiilwdaslssasndthssvkltw 
gtyqqllkq kcwqngrvpkpesg CTE I HHHBWSKMALFDFLLQ I 
ynrldtnccgfrprkedacvqnglrpkcddqgsaalahi iqrkh 

DPRHLV F I DNKG FFDRS EDNLNFKLLEGI KEFPAS AVY VLKS QH 

lrqkllqslfldkgywesqggrqgieklidviehrakilityin 

AHGVKVLPMNE 


5843 
5844 


500 


1453 


gtarlvtcwvlhgq*vkkpawepgwwl*q*rcrpkgwglgagm 
rgsrmsqppqclrraqsscchfmvkllddgtfmipgekvahtsl 

DALVTFHQQKPIEPRRBLLTQPCRQKDPANVDYEDLFIiYSNAVA 

eeaacpvsapeeaspkpvlchqskerkpsaem/rqnnhqgshfl 

lppkipswrdppetleepqnaprerpegpaaakkpprhcblwt 

lgcpeihgdlrpwdrkrqprslrgshlggqrlhgslcghisqkp 

ltapgtkrqkgphqegrevgqlh*gdprgqelapmgsespii,pg 
vqarapglgra 


5845 


202 


2471 


FDSAVLS S X NVMAVLPG PLQUjG VLLTI S LS S I RIjI qaga y yg I 
KPLPPQ I P PQMPPQ I PQ YQ PLGQQVPHM P LAKDGLAMGKEMPHL 
QYGKEYPHLPQYMKEIQPAPRMGKEAVPKKGKEIPLASLRGEQG 
PRGEPGPRGPPGPPGLPGHGIPGIKGKPGPQGYPGVGKPGMPGM 
PGKPGAMGMPGAKGEIGQKGEIGPMGI P * PQGPPGPHGLPGIGK 
PGGPGL PGQPG PKGDRGPKGLPG PQGLRGP KGDKGFGMpGAPG V 
KGPPGMHGPPGPVGLPGVGKPGVTGFPGP\QGPLGK\PGAPGEP 
GPQGPIGVPGVQGPPGIPGIGKPGQDG\IPGQPGFPGGKGEQGL 
PGLPGPPGLPGIGKPGFPGPKGDRGMGGVPGALGPRGEKGPIGA 

PGI GG P PGE PGT j Pf? T PfSPMfl D Pf2 A T/"2 ddp d irn cv-«/-» t * r/i t\^\^ nn^i 
s w j. ^ *- *j t" vj u v» j. f v? irnu .f vaft iy j< p y p jtw-fcXjvj J.VGPQGPPG 

PKGEPGLQGFPGKPGFLGEVGPPGMRGFPGPIGPKGEHGQKGVP 
GLPG VPGLLGP KGE PG I PGDQGLQGP PG IPGIGGPSGPIGPPGI 
PGPKGEPGLPGPPGFPGIGKPGVAGLHGPPGKPGALGPQGQPGL 
PGPPGPPGPPGPPAVMPPTPPPQGEYLPDMGLGIDGVXPPHAYG 
A KKG KNGG P A YEMPAFTAELTAPFPP VGAP VKFNKLL YNGRQN Y 
NPQTGIFTCEVPGVYYFAYHVHCKGGNVWVALFKNNEPVMYTYD 
E YKKGFLDQASGSAVIiLLRPGDRV FLQMPSEQAAGLYAG QYVHS 
SFSGYLLYPM 




215 


2061 


HASNKSASLQUKMANPKEKTAMCLVNELARFNRVQPQYKLLNER 
GPAHSKMFSVQLSLGEQTWESEGSSIKKAQQAVGNKALTESTLP 
KPI*KPPKSNVNNNPGCITPTVELNGLAMKRG\KPAIHRPLDPK 
PFPNNRANYNFQVMYNQRYHCP I PKI FYVQLTVGMNE FFG EGKT 
RQAARHNAAMKALQALQNE PIP ERS PQNGES GKDMDDDKDANKS 
EISLVFEIALKRNMPVSFEVIKESGPPHMKSFVTRVSVGEFSAB 
GEGNSKKLSKKRAATTVXiQELKKIiPPLPVVEKPK\HFFKKRPKT 
IVKAGPEYGQGMNPISRLAQI0X2AKKEKEPDYVLLSERGMPRRR 
EFVMQVKVGNEVATGTGPNKKIAKKNAAEAMU^I^YKASTNIjQ 
I3QLEKTGENKGWSGPKPGFPEPTNNTPKGILHLSPDVYQEMEAS 








RHKVISGTl'IiGYLSPKDMNQPSSSFFSISPTSNSSATIARELLM 
NGTSSTAEAIGLKGSS PTPPCSPVQPS KQLE YLARIQGFQVHYC 
DRQSGKECVTCliTLAPVQMTFHAIGSSIEASHDQV+YATAILLC 
YGPARKWKAIKMEAMCAHAALLSLIHYLlAPSARLEKSKLFAXiG 


5846 


1126 


456 


FSKLIKKTFI IG ISGVTNSGKTTLAKNLQKHLPNCS VI SQDDFF 
KPESEIETDKNGFLQYDVLEALNMEKMMSAISCWMESARHSVVS 
TDQ3S AEE I P I L 1 1 EGFLLFN YKPLDTI WNRS YFLT I P Y3ECKR 
RRSTRVYQPPDSPGYFDGHVWPMYLKYRQEMQDITWEWYLDGT 
KSEEDLFLQVYEDLIQELAKQKCLQVTA*RRNTTNPS /CK* IRK 
LQGVI 


5847 


2769 


505 


AP EMEDLS S PDS TLLQGGHNLLS S AS FQE S VTFKDV I VD FTQE E 
WKQLDPGQRDLFRDVTLENYTHLVSIGLQVSKPDVISQLEQGTB 
PWIMEPSIPVGTCADWETRLENSVSAPEPDISEEELSPEVIVEK 
KKRDDSWSSNLLESWBYEGSLERQQANQQTLPKEIKVTEKTIPS 
WEKGPVNNEFGKSVNVSSNLVTQEPSPEETSTKRSIKQNSNPVK 
KEKSCKCNECGKAFSYCSALIRHQRTHTGEKPYKCN*/CVEKAF 
SRSENLINHQRIHTGDKPYKCDQCGKGFIEGPSLTQHORIflTGE 
KPYKCDECX5KAFSQRTHLVQHQRIHTGEKPYTCNE0GKAFSQRG 
HFMEHQKIHTGEKPFKCDECDKTFTRSrHLTQHQKIHTGEKTYK 
CNECGKAFNGPSTFTRHHMTHTY7RVt»virrTJB , rY*v&i?o/-vuexTr mr> 

HQ KTHTGE KP YD CAE CX3KS FS YWS SLAQHLKIHTGEKP YKCNEC 

GKAFSYCSSLTQHRRIHTREKPFECSECGKAFSYLSNLNQHQKT 

HTQEKAYECKECGKAFIRSSSLAKHERIHTGBKPYQCHECGKTF 

SYGSSLIQHRKIHTGERPYKCNECGRAFNQNIHLTQHKRIHTGA 

KPYECA3CGKAFRHCSSLAQHQKTHTEEKPYQCNKCEKTFSQSS 

HLTQHQRIHTGEKPYKCNECDKAFSRSTHLTQHQRIHTGEKPYK 

CNECGK\TFSQSTYLIQHQRIHSGEKPFGCNDCGKSFRYRSAZiN 
KHQRLHPGI 


5848 


22 


2961 


AAPRRIjLRGGDGDRTPRFPLPALLRPGPPAEAAPERRKMPAVSK 
GDGMRGIAVFI SDIRNCKSKEAEIKRINKELANI RSKFKGDKAL 
D3YSKKKYVCKLLFIFLLGHDIDFGHMEAVNLLSSNRYTEKQIG 
YLF I S VLVNSNSE L I RL I NNAI KNDLAS RNPTFMGLALHC I AS V 
GSREMAEAFAGEIPKVLVAGDTMDSVKQSAALCLLRLYRTSPDL 
VPMGDWTSRWHLLNDQHLGWTAATS h ITTLAQKNPEEFKTSV 
SLAVSRLS\RIVTSASTDLQDYTY*FCPGFLGLSVKLIiRLLQCY 
P PPDPAVRGRLTECLETILNKAQEPP KSKKVQHSNAKNAVLFEA 
I S L 1 1 HHDS E PNIiLVRACNQLGQ FLQHRETNLR YLALESMCTLA 
S S EFS HEAVKTHI ET VI NALKTERDVS VRQRAVDLL YAMCDRSN 
A?QIVAEMLSYLETADYSIREEIVLKVAILAEKYAVDYTW\YVD 
TILNLIR I AGD YVS EEVWYRVIQIVINRDDVQGYAAKTVFEALQ 
APACHENLVKVGGYI LGEFGNLI AGDPRSS PLIQFHLLHSKFHL 
CSVPTRALLLSTYIKFVNLFPBVKPTIQDVLRSDSQLRNADVEL 
QQRAVEYLRLSTVASTDILATVLEBMPPFPERES S 1 LAKLKKKK 
GPSTVTDLEDTKRDRSVDVNGGPEPAPASTSAVSTPSPSADLLG 
LGAAPPAPAGPPPSSGGSGLLVDVFSDSASWAPLAPGSEDNFA 
RFVCKNNGVLFENQLLQIGLKSEFRQNLGRMFI FYGNKTSTQFL 
NFTPTLICSDDLQPNLNrOTKPVDPTVEGGAQVQQWNlECVSD 
FTEAPVLNIQFRYGGTFQNVSVQLPITLNKFFQPTEMASQDFFQ 
RWKQLSNPQQEVQNIFKAKHPMDTEVTKAKIIGFGSALLESVDP 
NPANFVGAG 1 1 HTKTTQ IG CLLRLE PNLQAQMYRLTLRTS *<EAV 
SQRLCELLSAQF 


5849 


3545 


1895 


KRREIKETVFHHVAQAGLELLSSSNPPSSASRSAGITGMRHQVQ 
P*DPCMSLSPPCFTEEDRFSLEALQTIHKQMDDDKDGGIEVEES 
DEFIREI3MKYKDATNKHSHLHREDKHITIEDLWKRWKTSEVHNW 
TLEDTLQWI.I EFVELPQYEKNFRDNNVKGTTIjPRI AVHEPS FMI 
SQLKISDRSHRQKLQLKALDWLFGPLTRPPHNWMKDFILTVSI 
VI GVGG CW F AYTQNKTS KEHVAKMMKDLES LQTAEQSIiMDLQER 
LB KAQE ENRNVAVEKQNL * RKMMDE I N YAKEEACRLRE LREGAE 

celsrrqyaeqeleqvrmalkkaekefelrsswsvpdalgkwlq 

LTHEVEVQY YNI KRONAEMQLAIAKDEAEKI KKKR S TVFGTLHV 








AHSSSLDEVDHKILEAKKALSELTTCLRERLFRWQQl^KiCGFQ ' 

lAHNSGIiPSIiTSSLYSDHSWVVMPRVSIPPYPIAGGVDDLDBDT 

PPIVSQFPGTMAKPPGSLARSSSLCRSRRS1VPSSPQPQRAQLA 

PHAPHPSHPRHPHHPQHTPHSLPSPDPDILSVSSCPAIiYRNBEE 

EEAIYFSAEKQWEVPDTASECDSLNSSIGRKQSPP/SKPRDIPN 

IIS/DERYQEMRCP*RIPSGGIL 


5850


3 


1895 


KAVLNF S ASGS VISLTGSNPMHDASMWHL KKNGI I VYLDVPLLN 
LICRLKLMKTDR1VGQNSGTSMKDLLKFRRQYYKKWYDARVFCE 
SGAS P EE VAD KVLNA I KR YQDVDS ET FI S TRHVWPBDCEQKVS A 
EFF I EAV Z EGLASDGGLFVPAKE FPKLS CG EW KS LVGAT Y VE RA 
QI LLERCI HPADIPAARLGEM I ETAYGENFACS KIAPVRHLSGN 
QFILELFHGPTGSFKDLSLQLMPHIFAQCIPPSCNYMILVATSG 
w4voAvunur orujiNiUiL/^yKIAVVAr FPENGVSDFQKAQIIGSQ 
RENGWAVGVESDFDFCQTAIKR I FNDSDFTGFLTVEYGTILSSA 
KS INWGRLLPQWYHASAYLDLVSQGFISFGSPVDVCI PTGNFG 
KI LAAVYAKMMGI P I RKFI CASNQNHVWTDFIKTG\HYDLRGKB 
N*AQTFFTVQ* I FLPNLSNLERHLHLMANKDGQLMTELFNRLES 
QHH FQ X E KAL VE KLQQDFVADWCS EGECLAA INSTYNTSG Y ILD 
PHTAVAK WADR VQDKTCP V 1 1 SS TAH YS KFAPAI MQALKI KE I 
NBTSS SQLYLLGS YNALPP LHEALLERTKQQ EKME YQ VCAADMN 
VLKSHVEQLVQNQFI 


5851 


3120 


1802 


RCYLQFLALLLTSTSARAAAAIAAAEEPAGSPSVMTRAGDHNRQ
RGCCGS LADYLTSAKFLLYLGHSLSTWGDRMWH FAVSVFLVELY 
GNS L LLTAVYGL WAGS VLVLGA 1 1 GD WVDKNARLKVAQTS L W 
QNVSVI LCGI ILMMVFLHKHELLTMYHGWVLTSCYILI ITIANI 
ANLAS TATA I TI QRDW I VWAGEDR S KLANMNATI RR I DQ LTN I 
LAPMAVGQIMTFGSPVIGCGFISGWNLVSMCVEYVIiLWKVYQKT 
PALAVKAGLKEEETELKQLNLHKDTEPKPLEGTHLMGVKDSNIH 

uuiiiLiiVDr a uw c ivLAjri VOX * Ny JrVJ? / LiGnHGSCFP 

LYDCPGL* LHHHRVRLHSGTEWFHPQYFDGS IS YNWNNGNCS FY 
LATSKMWFGSDRSDLRIGTAFLFDLVCDLCIHAWKPPGLVRFSF 


5852 


1 


422 


KTTFPSSLCPLRQLPEVRGYSGQPLTDPLISlCRSHKCRGKGWG 
SSSYPSLPALLRARSAPGHCTHRSCGPEWRIDSISRLEMQGARR 
SGWAQAQPTILLLVPRLRKSLPSIWG/SLMGPFITSGPG/WFRQ 
YYFFISGRH^VLFTESDFYYVAMDFGGHGLSSKYSPGVPYYLQT 
FVSB I RRWAGKKQSVYFRRCGGCSRAP PLITGGGVGSRKQRWP 
ESGAWALAPGL P AI HGRS WE S 


5853 


223 


1346 


t\ ULAlLiO K V IUs Utt\3 JfAASAW IS JJFUTKliD PGG P WGM WRGSDLRPR 

PVSLTGLTLVCK*AAQGPQV\HSVKLCFGLGG\PCLL\FPIFRP 
LLLHPRRPRLHPGTRGVAVEPHALRVVHVAHGEEAGI RAAGPGH 
GGVE I PQG / VGS LGARRGLRPS RPSS RHRKR VPAP P PGRPLATP 
HRRRFP PD PALTCPGLGQDQGPRE QQKQGS GRHDT I LGDWGES E 
SRWVRGNFRTGTAATLIGFSRNPTLNGSENWGSLVSIQEEGPDT 
GWEREKRN PAEMGNPQRWAS P I HTP PLG PE I LRAM PEALRAM PE 
ALGLRPDPATSVPSALS /OTP / PESWpr ^rr.PKrnfwrr /avir duo 
LSSLCITESPSQNWTPCLLLLTCPRGLF 


5854 


86


938 


KGRNTAPEKKGAALNNRENASS *NGY/SRMKQDIRRIENHI IQE 
LKHLCAMIKRVLLERLENTRKLRELTEGRTLDWPQNRITEVSAK 
RQIVTEYREKGKRN*EEKKRDLEGRSRRYNLCIIGIPETEDRAS 
GAET1 KDIiLE/ENFPELKNELDLQMEKAHRI PLKFNEKKAASRH 
IRVTFL/KFQRRNILQASSQRKQVTYKGAKVRLTSDFSPAILNA 
RRQ W/N/ P I SRVLRENNFEPR 1 1 YSAKLS FLY KGNWKTFLD IQG 
LGKYINQELS LK ILLKDLLQLTENLN 


5855 


536 


2391 


LRSYGCKAPSRISHLHK\FLFLLLPSLLMGYSBSPPPITDSWAP 
FISLTHHVLSQSQSPLSSNCWICLSTHTQ*FTALPADLLTHTQS 
NVSLHISYLAIPFLADSFtiKPV/L*PGKSAKHLSFKLSSLSMVS 
GRAVALLHLIASGLTSIQTNTASSKPPIWGY\LSTQTSFISPPP 
LCLS RT YPN PAHATMVGQVPQS LCGLX FTL /RTP CRPS I LH PNY 
KIISTSAWQKVLCFSGSPTXHTSLHLTTGSSFLSFHPIPGFPAA 
NSALYVSSLKGPPGKNVTIPSPVTGT*QPPHRGSN/RLTVDKDN 
FFLSPKPNSLHQLPSQ\TPYQALTtiAAlAGSYPlWENENTLSWlH 
PTPTYNFCLSTPSLF FLCDTN* YLCLPANWSGTCTLVPQAPTI N 
ILPPNQTILISVEAS I SSSPI RNKWALHLI TLLTGLG 1 TAALGT 
GIAGITTSITSYQTLFTTLSNTVEDMHTSITSLQRQLDFLVGVI 
LQNWRVLDLIiTTEKGGTCIYLQEECCFCVNESGIVHIAVRRLHD 
RAAEL*HQVADSWWQGSSLLRWIPWVAPFLGPLIFLFLLLMIGP 
C I FNLVSR F I S QRLNC F I QAS M QKH I DN I FH LCHV* YQS LRGNH 
SEAPEPRP 


5856


173 


1137 


PWLHGLGLSAVFLFYL*/YVTFHLYGGIILLLLIFISIAGILYK 
FQD VLL YFP EQ PS S S RL YVPM PTG I PHENI FI RTKDG I RLNLI I» 
IRYTGDNS PYS PTI IYFHGNAGNIGHRLPNALLMLVNLKVNIiLL 
VD YRG YG KSEG EAS E BG L YLD5 EAVLDY VMTS PDLDKTKI YLSG 
R5LG\GAAAlHbASDNSHRISAIMVENTFLSIPHMASTLFSPFP 
MRYLPLWCYKNKFLSYRKISQCRMPSLFISGLSDQLIPPVMMKQ 
LYELS PSRTKRLAI FPDGTHNDTWQCQGYFTALEQFI KEWKSH 
SPEEMAKTSSNVTI I 


5857 
5856 


1597 


563 


KLIGKVLVLSWAEAMAAFAVBPQGPALGSEPMMLGSPTSPKPG

VNAQFLPGFLMGDLPAP VTPQP RS I SGPS VG VMEMRS PLLAGGS 
PPQPWPAHKDKSOAPPVRSIYDDISSPGLGSTPLTSRRQPNIS 
VMQSPLVGVTSTPGTGQSMFSPASIGQPRKTTLSPAQLDPFYTQ 

GDSI*TSEDH\LDDSWGDCIWGFLKASA\SYILL\QFAQYGGIS*
KMWMSNTGNWMHIRYQSKLQARKALSKDGRIFGES1KIGVKPCI
DKSVMESSDRCALSSPSLAFTPPIKTLGTPTQPGSTPRISTMRP
LATAYKASTSDYQVISDRQTPKKDESLVSKAMEYMFGW




553 


1419 


»«x«vnrw*t3 i *'i'*'-fWWSiSiU^ V VAyGPGPAPGVGSAP 

PASSSAPPATPPTSGAPPGSGPGPTPTPPPAVTSAPPGAPPPTP 
PSSGVPTTPPQAGGPPPPPAAVPGPGPGPKQGPGPGGPKGGKMP 
GGPKPGGGPGLSTPGGHPKPPHRGGGEPRGGRQHHPPYHQQHHQ 
GPP PGGPGGRS EEKI SG PRRG FXAN LS LLRRPGEKTYTQRCRFC 
LLGIYLLISRRMNSRRLFAKIWENQEKFLSTKAKDSEFIKLESR 
ALA+NCPKFELG*YTP*GGRQLPSSLFPTHACLPLSCSVIFSPF 
MFPQ*NCWGRKPFRPNLGPHLKGAVCNRWDDPWEGPTGKGHCLN 
FAS 


5859


307 


1503 


GGS SAR PRAS S RRM LSR KXTKNE VS KPAE VQGKYVKKETS PLLR 

NLMPSFIRHGPTI P^RTOTCT.Pn^QDMIi VCTCrnrinrcnMnti r-rr 

RTPIQRTPHEIMRR3SNRLSAPSYLARSIADVPREYGSSQSFVT 
EVSFAVENGDSGSRYYYSDNFFDGQRKRPLGDRAHEDYRYYEYN 
HDL FQRM PQNQGRHASG I GR VAATS LGN LTNHGS EDLPL P PGWS 
VDWTMRGRKYYIDHNTNTTHWSHPLEREGLPPGWERVESSEFGT 
YYVDHTNKKAQY\RHPCAPTCTSV*ST?SCHI/AS/RQQTERNQ 
S LLVPANP YHTAE I PD WLQVYARAPVKYDHI LKWE LFQLADLDT 
YQGMLKL L FM KE LE Q I VKM YEA YRQALLTE L ENRKQR QQ W Y AQ Q 
HGKNF 


5860 


2956 


1270 


tirveefplcpgggkaqlssaslusaglllqpptppp^lllCfp 
lllfs rlcgalag p 1 i vephvtavwgknvs lkcl i e vneti tq i 

S WEKI HG KS S QT VAVHHPQYG FS VQGE YQGR VLFKNYS LNDA7 1 

TLHNIGFSDSGKYICKAVTFPIiGNAQSSTTVTVLVEPTVSLIKG 
PDS L I DGGNETVAAI C I AATO K P VAH t nwrnnr jzwmt? c t>tt «*t% 

NETATI I SQ YKLFPTRFARGRR I TCWKHPALEKDIRYSFI LDI 
QYAPEVSVTGYDGNWFVGRKGVNLKCNADANPPPFKSVWSRLDG 
QW PDGLLAS DNTLHFVH PLTFN Y SGVY I C KVT \ NS PGS KE VTQK 
VHP'rFQDPSLPTYPPLPALQFQWASPSTA*TSRD\LATEP*KIA 
PSPLSTL\ATIKGWTQLPTIIA*CSGVGALFIV\LVKCFGLGIP 
CYRRRRTFRGD YFAKN Y I PPSDMQKESQ IDVLQQDELDPYPDS V 
KKENKNPVNNLIRKDYLEEPEKTQWNNVENLNRFERPMDYYEDL 
KMGMKFVSDEHYDENEDDLVSHVDGSVISRREWYV 


5861 


2051 


1305 


EVCACVQAF^VASSGDDSQGGDKCGCEVGSWVGS^VVMA^nr - 
SEGEQGIPTACAAFAQQPAG/BPRRGLAGVGEGGPOCSWVNYRC 
TLEFL VS LLGTDLARGRGNS ASG PTAPADS KQL/ML * DVHRRVI 
LE + RMNSGSPARDNAPSQRFCTNLSEGLRFG IS PSWREALYGCH 
A 


5862 


1556


483 


PP FOL I MGE I KVS PDYNW FRGTVPI ,kkt t vrm nnc irTucT vtS ivn 
PRSIRCPLIFLPPVSGTADVFFRQILALTGWGYRVIALQYPVYW 
DHLEFCDGFRKLLDHLQLDKVHLFGASLGGFLAQKFAEYTHKSP 
RVHSLILCNSFSDTS1 FNQTWTANS FWLMPAFMLKKIVLGNFSS 
GPVDPMMADAIDFMVDRLESLGQSELASRLTLNCQNSYVEPHKI 
RD I P VT IMDV FDQS ALS TEAKEEM Y KI/YPNARRAHLKTGGNF P Y 
LCRSAE VNLYVQ IHL/R/RNSME PNTR PLTHQWS VPRS LRCR KA 

ALAS ARRSSS VS LAVNDELTRCVLV* S VASAPVSRPFPSGSSGS 
PVLTVSGK 


5863 


2714 


249 


PFPSRGSbPIiAAPREDTMGPLMVLFCLLFLYPGLADSAPSCPQN 
VN I SGGTFTLS HGWAPGS LLTYS CPQG L YPS PAS RLCKS SGQWQ 
TPGATRSLSKAVCKP VRCPAP VS FENG I YTPRLGS YPVGGNVSF 
ECBDGF I \LRGS PVRQCR PNGMWDGETAVCDNGAGHCPNPG I Sh 
GP\VRTGFRFGHGDKVRYRCSSNLVLTGSSERECQGNGVWSGTE 
PICRQPYSYDFPEDVAPALGTSFSHMLGATNPTQKTKESLGRKI 
QIQRSGHLNLYLLLDCSQS VS ENDFLI FKESASLMVDRIFS FEI 
NVSVAI ITFASEPKVLMSVLNDNSRDMTEVISS LENAN YKDHSN 
GTGTNTYAALNS VYLMMNNQM RLLGMETMAW\QE I RHAI I LL\T 
DGK\SHMGGS PKTAVDH I RE I LNINQKRNDYLDI YAIG VGKLDV 
DWRELNELGS KKDGERHAF I LQDT KALHQ VFEHMLD VS KLTDTI 
vaj vuHnoMiVtoLiycK X rWrtv 1 lAFKiyjil \C \RGAIjISDQWVIjT 

AAHCFRDGNDHSLWRVNVGDPKSQWGKEFLIEKAVISPGFDVFA
KKNQGIL\EFYGD\DIALL\KIAQKVKM\STHCQGPSCLP\CTM
\EANLGFLRETFKGSTCR\DHENEL/VXVNKQSV\PAHF\VAL\N

GSKLEHLTLRMGVEWTSCCRGLSPKKKTM\FP^fLT\DVRB\WT 
D\QFL\CS\GPQEDESP\CK*E\SGGA\VFLEKRFJ«LSAGGVWC 
SWGL\YNP\CT,GSA\DKNSPKKGPSVAKVPPPTR/DFHIN\LFP 

0*SPWIiROHPGRMS*TT?T,OT.T.lVM'rtlIT.Cdrivr , T3»r>T/^onT tl-ct no 
W o r niiAunruurio ± c JL> r LiLu\r4\jnlJ& Jr rALFAKl L.Kir.LiHc L>PS 

EWATLRTL 


5864 


173 


1013 


PLISVPQSLISLPQPLIiCFPGGQEPSAPSPCLYSFLWACSFTMG 
KLPPS I PPSS PLACVLKNLKPLQLTPDLKPKClal FFCNTAWPQY 
KLDNDS K* PENGTFEFS I LQ VLDNS CHKMGKWS E VPDVQAFF\ S 
HWSLPSLCSQC/GLIPNLSSFSPFCSFG/PPPQVPSP/TESFFS 
MDSSDLPPSPQAAPRQAEPGPNSHLASAPPPYNPFITSPPHTWS 
SLQFHSVTSPPPPAQQFTLKKVAGAKGIVKVSAPFSLSQIR*RL 
GSFSSNIKIQPSSWLIWQQP 


5865 


568 


1684 


CLPGPRWQEGWRAGHTIVGCIPPKTAT -QWTrvrrMVT n/rMr^p 
LSVCVCVQVGSWICV/CVSMCACVSLC?C\ICRCISMYTREHAC 
ACTRV*VYMCMS/VCTCVSTCIDVRVCAHVCVYMCLCLGYA*AC 
TCV*MCVCMHEHVCMC/VCACSCVLL/CRGHICM/MCMSAYICI 
/WYVCVLC^WACMRMSTCVWLVYG*ACTCVWMHM/CSCTCR/C 
VHVCCMSMHACECLCVYLH ICGCAGTRRWWAGSARGSRS CS RLP 
CWAPGPGLSLPGPSCPSVEQGLGGGPGQLQGRSGEARLGEHRGW 
GSPAAVCSRNCTVSPRRGADCFEAPDVPKQPPGWGRASFEERGC 
GGRGWVCAPPLKGPQCCCFSI KPELKAKKKK 


5866 


98 


3197 


ARPEVPAPPAWLSRRGAAKMGDKKDDKDSPKKNKGKERRDLDDL

KKEVAMTEHKMSVEEVCRKYNTDCVQGLTHSKAQEILARDGPNA 

LTP P PTT P EW VKFCRQLFGGF S I LLW IGA I LCFLAYGI QAGTED 

DPSGDNLYLGIVLAAWI ITGCFS YYQEAKSSKIMESFKNMVPQ 

QALVIREGEKMQVNAEEVWGDLVEIKGGDRVPADLRIISAHGC 

KVDNSSLTGESEPQTRSPDCTHE\NPLKTRNITFFSNNFVEGTA 

RGVWATGDRrVMGRIATLASGliEVGKTPIAIEIEHFIQLITGV 

AVFIX3VS FFI LSLI LGYTWLE AVI FLIGI ivanvpegllatvtv 

CIiTLTAKRMAR KN Cb V KNLEAVE TLGS TS T I CSDKTGTLTQNRM 

TVAHKWFDNQIHEADTTEDQSGTSFDKSSHTWVALF*H/LI J GFC 

NRPVFKGGQDNIPVLKRDVAGDASESALLKCIELSSGSViCLMRE 

RNKKVAEIPFNSTNKYQLSIHBTEDPNDNRYLLVMKGAPERILD 

RCSTILLQGKEQPLDEEMKEAFQNAYLBLGGLGERVLGFCHYYL 

PEEQFPKGFAFDCDDVNFTTDNLCFVGLMSMIGPPRAAVPDAVG 

KCRSAGIKVIMVTGDHPITAKAIAKGVGI IFEGNETVEDIAARL 
iuf vjyvw rRUHAAL v a. nvj luuejjiria a y i u k x LQNHTEI V PAR 
TS PQQKLI I VEG CQRQGA I VAVTGDG VNDS PALKKAD IGVAMG I 
AGS DVS KQAADM I LLDDNFAS IVTGVBEGRIjIFDNLKKSIAYTL 
TSNIPEITPPLLPIMANIPLPLGTITILCIDU3TDMVPAISIAY 
EAAESDIMKRQPRNPRTDKLVNERIjISMAYGOIGMIOALGGPFS 
YFVILAENGFLPGNLVGXRLNWDDRTVNDLEDSYGQQWTYEQRK 
VVEFTCHTAFFVSIVVVOWADLIICKTRHNSVFQOGMKNKILIF 
GLFSETALAAFIiSYCPGMDVALRMYPLKPSWWFCAFPYSFLIFV 
YDEIRKLILRRNPGGWVEKETYY 


5867 


3 


148$ 


LPGRRARGGRGLGWPPAQAIaDGSRMGKAKVPASKRAPSSPVAKP 
GPVKTLTRKKNKKKKRFWKSKAREVSKKPASGPGAWRPPKAPE 
DFSQNWKAIiQEWLLKQKSQAPEKPLVISQMGSKKKPKIIQQNKK 
ETSPQVKGEEMPAGKDQEASRGSVPSGSKMDRRAPVPRTKASGT 
EHNKKGTKERTNGDIVPERGDIEHKKRKAK\GQPQPHPPR/IDI 
WFDD VDPAD I E AAIG P E AAKI ARKQLGQS EGS VS LS LVKEQAFG 
GLTRALALDCEMVGVGPXGEESMAAR VS I VNQYGKCVYDKYVKP 
TE P VTDYRTAVSGIRPENLKQGEELE WQKEVAEMLKGRI LVGH 
ALHNDLKVLFLDHPKKKIRDTQKYKPFKSQVKSGRPSLRLLSEK 
I LG LQ VQQAEH CS IQDAQAAMRL YVM VKKEWESMARDRRP LLTA 

PDHCSDDA+QSCPAAAAAPLQRQCDQSQGQITSPQSGNSGETFS 
ESWQRGVAWCY 


5868 


2122 


833 


LTAGASHTQDASQSTSAKYPAAAQNL/CVTNAMREDLADIWYIR
AVTVYDKPASFFKETPLDLQHRLFMKLGSMHSPFRARSEPEDPV
TERSAFTERDAGSGLVTRLRERPALLVSSTSWTEDEDFSILLAA
LESRV*T\MTLDGHNLPSLVCVITGKGPLREYYSRLIHQKHFQH
IQVCTPWLEAEDYPLLLGSADLGVCLHTSSSGLDLPMKVVDMFG
CCLPVCAVNFKCLHELVKHEENGLVFEDSEELAAQLQMLFSNFP
DPAGKLNQFRKNLRESQQLRWDESWVQTVLPLVMDT


5869 


2122 


833 


LTAGASHTQDASQSTSAKYPAAAQNL/CVTNAMREDLADIWYIR
AVTVYDKPASFFKETPLDLQHRLFMKLGSMHSPFRARSEPEDPV
TERSAFTERDAGSGLVTRLRERPALLVSSTSWTEDEDFSILLAA
LESRV*T\MTLDGHNLPSLVCVITGKGPLREYYSRLIHQKHFQH
IQVCTPWLEAEDYPLLLGSADLGVCLHTSSSGLDLPMKVVDMFG
CCLPVCAVNFKCLHELVKHEENGLVFEDSEELAAQLQMLFSNFP
DPAGKLNQFRKNLRESQQLRWDESWVQTVLPLVMDT


5870


2122 


833 


LTAGASHTQDASQSTSAKYPAAAQNL/CVTNAMREDLADIWYIR
AVTVYDKPASFFKETPLDLQHRLFMKLGSMHSPFRARSEPEDPV
TERSAFTERDAGSGLVTRLRERPALLVSSTSWTEDEDFSILLAA
LESRV*T\MTLDGHNLPSLVCVITGKGPLREYYSRLIHQKHFQH
IQVCTPWLEAEDYPLLLGSADLGVCLHTSSSGLDLPMKVVDMFG
CCLPVCAVNFKCLHELVKHEENGLVFEDSEELAAQLQMLFSNFP
DPAGKLNQFRKNLRESQQLRWDESWVQTVLPLVMDT


5871 


3 


3465 


FFFCRPLRLYSKTTGDRSAMAGAAGLTAEVSMKVLERRARTKRS
VLKLL*LSLRRL*LEPTI*NGLLT*CSRLSVFRFLKV\GSVYEP
LKSINLPRPDNETLWDKLDHYYRIVKSTLLLYQSPTTGLFPTKT
CGGDQKAKIQDSLYCAAGAWALALAYRRIDDDKGRTHELEHSAI

KCMRGILYCYMRQADKVQQFKQDPRPTTCLHSVFNVHTGDELLS 
YEEYGHLQINAVS LYLLYLVEM I S5GLQI I YNTDEVS FIQNkVF 
CV\ERVYRVP\DFG\VWGKREGKYY*/SGSTELHSSSVGLGKRQ 
L*KQFNGFNLFGNG^CSWSVIFVDLDAHNRNRQTLCSLLPRESR 
SHNTDAALLPCISYPAFALDDEVLFSQTLDKVVRKLKGKYGFKR 
FLRDGYRTSLEDPNRCYYKPAEIKLFDGIECEFPIFFLYMMIDG 
V FRGNP KQ VQE YQDLLTP VLHHTTEG Y P WPKYY YVPADF VE YE 
KNNPGSQKRFPSNCGRDGKLFLWGQALYI I AKLLADEL IS PKDI 
D P VQRY VP LKDQRNVS MRFS NQG P LENDLWHVAL I AE S QRLQ V 
FLNTYG1QTQTPQQVEPIQIWPQQELVKAYLQLGINEKLGLSGR 
PDRPIGCLGTSKIYRILGKTWCYPIIFDLSDFYMSQDVFLLID 
DIKNALQFIKQYWKMHGRPLFLVLIREDNIRGSRFNPILDMLAA 
LKKG I IGGVKVHVDRLQTLISGAWEQLDFLRI SDTEELPEFKS 
FEELEPPKHSKVKRQSSTPSAPELGQQPDVNISEWKDKPTHBIL 
QKLNDCSCLASQAIIXGILLKREGPNFITKEGTVSDHIERVYRR 
AGSQKLWS WRRAASLLS KWDS LAPS ITtWL\>QGKQVTLGAFG~" 
KEEEVISNPLS PHVIQNI I YY KCNTHDEREAVIQQE L V I HI GW I 
I S NNPELFSGTLKI R I GW I IHAME YE LQI RGGDKPALDLYQLS P 
S E VKQLLLD I LQPQQNGRCWLNRRQ I DGSLNRTPTGF YDR VWQ I 
L»bKi frJ^ji x VAGKHIjPQQ PTLS DMTM YEMNFS LLVEDTLGNI DQ 
PQ YRQ I WB LLMWS I VLERN p ELE FQDKVDLDRLVKEAFNEFQ 
KDQSRLKEX EKQDDMTS F YNTP PLGKRGTCS YLTKAVMNLLLEG 
EVKPNNDDPCltIS 


5872 


68 


665 


VQGYMYRFVIKINSCYSEKTSICRHRCCPELPATQPWPTPTVFF 
NI AIDS ESLGCI \SFKLFADKV/ PKRWKKNPVLLNTGEKVLiGDK 
GPCFYRIIPgVLCQGGDFTHHNGTGGKSLYSKEFDDENFI/IjKH 
TAPGYLSTANAG PTTNGS Q FF I CTAKTEDG * QHWFGKVKDGMS 
IVEAIiERSGSRNGKTSKKITAANCGQL 


5873 


2240 


S 506 


RRPPEGGSGGGRRTRARMPLPWSLALPLLLS^TVAGGFGNAASAR " 
HHGLLASARQPGVCHYGTKLACCYGNRRNSKGVCEATCEPGCKF 
GEC VG PNKCRCFPGYTGKTCSQD VNE CGM KPRP CQHRCVNTHGS 
YKCFCLSGHMLMPDATCVNSRTCAMINCQYSCEDTEEGPQCIiCP 
S SGIiRLAPNGRDCLDIDECASGKVI CP YNRRCVNTFGS YYCKCH 
IGFELQYISGRYDCIDINECTMDSHTCSHHANCFNTQGSFKCKC 
KQGYKGNGLRCSAI PENS VKE VLRAPGTI KDRI KKLIiAHKNSMK 
KKAKI KNVTPE PTRTPTP KVNLQP FNYEEIVSRGGNSHGG\ KKG 
NEEKMKEGLEDEKREEKAliKD*HRRERPFRG\DVFFPKVNEAGE 
FGLI h \ VQRKALTS KLEHKADLNI SVDCSFNHG \ I CDW \ KQDR\ 
EDD FDW \ NPADR \ DNA I \G FY\MAVPGL WQGH K\ KD IGRLKLLL 
PDLQPQSNFCLLFDYRLAGDKVGKLRVFVKNSNNAIjAWEKTTSE 
DEKWKTGKIQLYQGTDATKSIIFEAERGKGKTGEIAVDGVLLVS 
GLCPDS LLSVDD 


5874


2 


3387 


ACPRLARRRRRVRSLRRRRGWLRARWSRGQNKMAARRITQETFD 
AVLQ EKAKRYHMDASGEAVS ETLQFKAQDLLRAVPRSRAEM YD D 
VHSDGRYSLSGSVAHSRDAGRESLRSDVPSGPSFRSSNPSISDD 
SYFRKECGRDLEFSHSNSRDQVIGHRKI/3HFRSQDWKFALRGSW 
EQDFGHPVSQESSWSQEYSFGPSAVLGDFGSSRLIEKECLEKE\ 
SRDYD VDHS G\EA\DS VLRG S \ SQVQA\ RGRALN I VDQEGS LLG 
. KGETQGLLTAKGG VGKLVTLRNVSTKKI PTVNRI TPKTQGTNQ I 
QKNTPSPDVTLGTNPGTEDIQFPIQKXPXiGLDLKNLRLPRRKMS 
FDIIDKSDVFSRFGIEIIKWAGFHTIKDDIKFSQLFQTLFELET 
ETCAKMLASFKCSLKPEHRDFCFFTIKFLKHSALKTPRVDNEFL 
NMLLD KGAVKTKNCFFE 1 1 K P FDKY I M R LQDRLLKS VTPLLMAC 
NAYELSVIO^KTLSNPLDLAUUiSTTNSLCRKSLALLGQTFSLAS 
S FRQEKIL * AVGLQDI APS PAAFPNFEDSTLFGREY IDHLKAWL 
VSSGCPLQVKKAEPEPMREBEKMIPPTKPEIQAKAPSSLSDAVP 
QRADHRWGTIDQLVKRVIBGSLSPKERTLLKEDPAYWFLSDEN 
SLEYKYYKlJCIiAEMQRMSENLRGADQKPTSADCAVRAI'IIiYSRAV 
RNLKKKLL P\ WQRRGLLRAQG \ LRG \ W KARRA\ TTGTQTLLFLR 
APGLKHHGRQAPGLS\QAKPSLPDRND\AAKD\CPLDPV\GPSP 
QDPSLEASGPSPKPAGVDIS EAPQTSS PCPSADIDMKDNGRTAE 
KLARFVAQVG\PEIEQF\Sl\ENSTDNPDLWFIi\HDQNSS\AFK 
FY\RKKVFELCPSICFTSSPHNL\HTGGGDTT\GSQESPVDLME 
GEAEFEDEPPPREAELESPEVMPEEEDEDDEDGGEEAPA\PGRG 
GPS LEGST P ADGL PGEA\ AEDDL/ ALGAPALFTGLLQVTCFP FG 
RGFSSKSLKVGMIPAPKRVCLIQEPKVHEPVRIAYDRPRGRPMS 
KKKKPKDLDFAQQKIi\TDK\NLGFQ\MLQKMGWKEGHGLGSLGK 

GIR\ SRSACTOOAAWGG *5GWr3Ti<! PQTfQ T,DT.f3 CBTlMfMa VCWnr 
IFVF 


5875 


296 


1848 


LAALGGLPLWRLSRRGFREYLLGLSAPSALGGAMRSVSYVQRVA
LEFSG^LFPHAICLGDVDNDTLNELVVGDTSGKVSVYKNDDSRP 
WLTCSCQGMIjTCVGVGDVCNKG KNLLVAVSAEGWFHL FDLT PAK 
VLDASGHHETLIGEEQRPVFKQH I PANTKVMLI SDIDGDGCREL 
WG YTDRWRAFR WE ELG EG PEHLTGQLVS LKK WML EG Q VDS LS 
VTLGPLGLPELMVSQ PGCAYAI LLCTWKKDTGS PPASEGPTDGS 
/SGDPSCPRRGAAPDIWPYPQQECLHSPNWQHQT\SHGTESSGS 
GLFALCTLDGTLKLMBEMEEADKLLWSVQVDHQLFALEKLDVTG
NGHE EWACAWDGQT Y 1 1 DHNRTVVRFQVDENIRAFCAGLYACK 
EGRNSPCLVYVTFNQKIYVYWEVQLERMESTNLVKLLETKP\ST 
TACCRSWAWILTTSL*LVPCPTKRSTIQTSHHSVLPQASRIPPS 
WTCLIAGEGPF»TPTLPPKGVFGSHCAAAG3ITKQ 



224 



"587T" 



2030 



1907 



HLPLGVPS KVAUAAAME PQEERETQVAAWLKKI FGDHP I ?Q YE V 
K PRTTE I LHHLS ERNR VRDRD VYLVIE DLKQKAS EYES EAKYLQ 
DLLMESVNFS PANLS S TGSRYLNALVDSAVALETKDTS LAS FI P 
AVNDLTSDLFRTKSKSBEIKIELEKLEKNLTATLVLEKCLQEDV 
KKAELHLSTER \ AKVDNRRQNM\ D FLKAKS EE FRFG I QAAGEQL 
SARGQ\DAFS VP IQSLVALIRBNWPRLKQQTI PLK\ KKLES YLD 
LMP\nPSHCSK*RIEEAK\RELA\5IEAELTRRVS\MMEIi 



5878 



950



2113 



GTLGKMAASSSGEKEKERLGGGLGVAGGNSTRERLIiSALEDLEV 
I^RELIE>IIAISRNQKLLO^EENQVLELLlHPJ)GEFQELMKLA 

LNQGKIHHBMQVLEKEVEKRDSDIQQLQKQLKEAEQILATAVYQ
AKEKLKSIEKARKGAISSEEIIKYAHRISASNAVCAPLTWVPGD
PRRPYPTDLEMRSGLLGQMNNPSTNGVNGHLPGDALA/RRKIAR
cpcstvs/ngsqmtcr*iniii,ilqksvcel 



981 



5880 



1138 



1324 



5881 



26 



5882 



2407 



GLWKCMQLQUPHTHRVQP*PTPRQQGPQ\VPVAVIAGNRPNYLY
RMLRSLLSAQGVSPQMITVFIDGYYEEPMDWALFGLRGIQHTP
ISIKNARVSQHYKASIITATFNLFPEAKFAWLEEDLDIAVDFFS
FLSQSIHLLEEDDSLYCISAWNDQGYEHTAEDPALLYRVETMPG

LGWVLRRSLYKEELEPKWPTPEKLWDWDMWMRMPEQRRGRECXI 
PDVSRSYHFGIVGLNMNGYFHEAYFKKHKFNTVPGVQIiRNVDSL 
KKEAYEVEVHRLLSEABVLDHSKNPCEDSFLPDTEGHTYVAFIR 
MEKDDDFTTWTQLAKCLHIWDLDVRGNHRGLWRLFRKKNHFLVV 
GVPASPYSVKKPP5VTPIFLEPPPKEEGAPGAPEQT 



RLTEAAAAGSG SRAAGWAGS PPTLLPLS PTS PRCAATMASSDED 
GTNGGAS EAGE DRE APGKRRR US FLATAWLT FYD I AMTAG WL VL 
AIAMVR F YMEKGTHRGLYKS I QKTLK FFQTPALLE I VHCL 1 G I V 
PTSVIVTGVQVSSRIFMVWLITHSIKPIQNEESWLFLVAWTVT 
EITRYSF YTFSLLDHLPYFI KWARYNFFI ILYPVGVAGELLTI Y 
AALPHVKKTGMFSIRLPNKYNVSFDYYYFLLITMASYIPLFPQL 
YFHMLRQRRKVLHG \G * L * KRM I K * S LQTRCFFQNNQD YLS PS F 
NNKNKQLCEISWIVWFLKI 



441 



2216 



S LWCLVAGGLGLG P S SQN PLQRAG I LAR PKEARGT FS ALTACS A 
SVTSKGKSSSGMWPSAASDRDSPVPLRPPGPVQLPSGTGWVIiSD 
*KIO<RGRCSS/WLSQPQHEREKEVVLLRRSMAEGERARAASDVL 
CRSLANETHQLRRTLTATAHMCQHLAKCLDERQHAQRNVGERSP 
DQSEHTDGHTSVQSVIEKLQEENRLLKQKVTHVEDLNAKWQRYN 
ASRDEYVRGLHAQLRGLQ1PHEPELMRKEISRLNRQLEEKINDC 
ASVKQE LAASRTARDAALER VQMLEQQ I LAYKDDFMS ERADRER 
AQSR I QELEE KVAS LLHQ VS WRQDS REPDAGR IHAGS KTAKYLA 
ADALELMVPGGWRPGTGSQQPEPPAEGGHPGAAQRGQGDLQCPH 
CLQCFSDEQGEELLRHVAECCQ 



GGIHPSPTEAFRAQHLTMDCTWRXLFL VAAATGTHAQVQLLQSG 

SEVKKPGASVMVSCYVSGYTLTKLSMm*VRQAPGKGLE*MGPFD 

LQDVETIYPQKFQGRVSMTEETSTETTQ/AYLELSSLRSEDTAV 
HHCATDTV 



SGCVEMLYSHSLEYNPEWISVQSAVAP AQLALNSDGDL*LHSGE 
RTRRD*QliPSAGGPGLQEPLQLGELDITSDEFILDEVDG\VDLR 
HYSKQVELELQQIEQKSIRDYIQESENIASLHNQITACDAVLER 
MEQMLGAFQS DLSS I S SBI RTLQE QSGAMMIRLRNRQAVRG KLG 
ELVDGLWPSALVTAILEAPVTEPRFLEQLQELDAKAAAVREQE 
ARGTAACADVRGVL DR LR VKAVTK I RE F ILQKTYS FR KPMTNYQ 
I POTALLKYRFFYQFLLGNERATAKEIRDEYVETLSKI YLS YYR 
SYLGRLMKVQYEEVAEKDDLMGVEDTAKKGFFSKPSLRSRNTIF 
TLGTRGS V I S PTELEAP I L VPHTAQRG EQR YPFEAL FRSQH YAL 
LDNS CRE YLFICEFFWSGPAAHDLFHAVMGRTLSMTLKHLDS Y 
IADCYDAIAVFLCIHIVLRFRNIAAKRDVPALDRYWEQVLALLW 


5883






PRFELI LEMN VCIS ^KSTUPQRLGGLDTRPHYITRKVAEPSSALV 
SINQTIPNERTMOLLGQLQVEVENPVLRVAAEFSSRKEQLVPLI 
NN YDMMLG VLM \ E * ERAADDS KEVB S FQ QLLNARTQEFI EELL S 

PPFGGLVAFVKEAEALIERGfQAERLRGEEARVTQLIRGFGSSWK 
SSVESLSQDVMRSFTNFRNGTSIIQGALTQLIO\liYHRFHRV\I* 
3QPQLRALPARAELINIHHLMVELKKHKPNF 


5884 


2 


1374 


EFPGRRFRAVMEAGAGAGAGAAGWSCPGPGPWTTLGSYEASEG 
CERKKGQRWGSLERRGMOAMEGEVLLPALYEEEEEEEBEBEEVE 
EEEEQVQKGGSVGSLSVNKHRGLSLTETELEFLRAOVTnT vm?t 

EETRELAGQHEDDSLELQGiLSDERLASAQQAEVFTKQIQQLQG 
ELRSLREEtSLLEHEKESELKEIEQELHLAQAEIQSLRQAAEDS 
ATEHESDIASLQEDLCRMQNELEDMERIRGDYEMEIASLRAEME 
MKSSEPSGS LGLSD YSGLQE ELQELRER YHFLNE E YRA LQES NS 
SLTGQLADLESERTQRATERWLQSQTLSMTSAESQTSEMDPLEP 
DPEMQLLRQQLRDAEEQMHGMKNKCQELCCELEELQHHRQVSEE 

EORRLQRELKCAQNEVLRFQTSHS\SPSHPLPPIPPSSPCLL*A 
LWISALLWCWWAETSS 


5885


4261 


2522 


GVIARASARi.KVPbTGVRACAKPEVGAEPAKVAGAAEPDEDGGR" 
SRLRDCGDYTPSERLGPKGAMLWFQGAI PAAI ATAKRSGAVF VV 
FVAG DDEQS TQMAAS WEDDKVTEAS S NS FVA I K IDTKS EACLQF 
SQ I YPWCVPSS FFIGDSGI PLEVIAGSVSADELVTRIHKVRQM 

HLLXSETSVANGSQSESSVSTPSASFEPNNTCENSQSRNAELCE 
I PSTSOTKSDTATGGESAGHATSSOEPSGC»3nnpt>&t?nT wtduc 

RLTKKr,EERREEKRKEEEQRElKKEIERRKTGKEMLDYKRKQEB 
ELTKRMLE ERNREKAEDRAARER I KQQ IALDRAERAARFAKTKE 
E VE AAKAAALLAKQAEME VXRES YARER S T VAR IQFRL PDGSS F 
TNQFPSDAPLEEARQFAAQTVGNTYGNFS LATMPPR d f rri^mv 
KKKLLDL ELAPSASWLLP / AL F INF * AGRPTAS I VHS SSG DI W 
TliLGTVLYP FLAI WRL I SNFLFSNPPPTQTS VR VTSS EPPNPAS 

SSKSEKREPVRKRVLEKRGDDFKKEGKIYRLRTQDDGEDENNTW 
NGNSTQQM 


5886


900 


467 


AAGGGRRSRLSRS^PTGPSKSPSGVRCCG\RR\AWEDKDEFLDV

IYWFRQIIAWLGVIWGVLPLRGFLGIAGFCLINAGVLYLYFSN 
YLQ I DEEEYGGT WELTKEGFMTSFA/ 1 VHGHLDHLLHCH PL * LM 
VYSSQVLPIQSKGPS 


5887


86 


1341 


PFRGRALTLKiCQPRPGVAPPSLGTCHKSDPGkPAAQSQPPSPGS 
GTFGLLSFRMVRTKTWTLKKHFVGYPTNSDFET.lfTQPT ddt v\m 

E VL LE ALFLTVD P YMRVAAKRLKEGDTMMGQQVAKVVE S KNVAL 
PKGTIVIiASPGWTTHSISDGKDLEKLLTEWPDTIPLSLALGrVG 
MPGLTAYFGLLE I CGVKGGETVMVNAAAGAVGS WGQIAKLKGC 
KWGAVGS DE KVAYLQKLG FDWFNYKTVES LEETLKKAS PDG Y 
DCYFDNVGGEFSNTVIGQMKKFGRIAICGAISTYNRTGPLPPGP 
P PE I G I YQ ELRMEAF WYRWQGDARQKAIjKD LLKWVIjELP YFV I 

D*LQANTLVYKSMKSAKPSLEYISEKLVSG\KIQYKEYIIEGFE 
NMPAAFMGMLKGDNLGKT I VKA 




1937 


104 


AFyC'KGCRATRCPCRGPRWDSLGDEAARS PAAPGGAPGLLGLRE^ 

R PDRCHPGG DDRG PQLHRGS PG / S PS ELS RRPG P PGL PGLQGP P 

PAPGLPQSRTL/PVLCVCDLSPAQCDINCCCDPDCSSVDFSVFS 

ACSVPWTGDSQFCSQKAVIYSLNFTANPPQRVFELVDQINPSI 

FCIHITN\*Nt*HYPLLIQKYL/NENNFDTLMKTSDGFTLNAESY 

VSFTTKLDIPTAAKYEYGVPLQTSDSFLRFPSSLTSSLCTDNNP 

AAFLVNQAVKCTRK I NLEQCE E I EALSMAF YSSPE I LRVPDSRK 

KVPITVQSIVIQSLNKTLTRREDTDVLQPTLVNAGHFSLCVNW 
LEVKYSLTYTDAG8VTKADLSFVLGTVSSVWPLQQKFEIHFLQ 
SNTQPVPLSGNPGYWGLPLAAGFQPHKGSGIIQTTNRYGQLTI 
LjHSTTEQDCIiALEG VRTP VL FG YTMQSG C K LRLTGAL PCQLYAQ 
CVKSLLWGQGFPDYVAPFGNSQGP/ ADMLDWVPIHFITQS FNRK 
)S CQLPGALVT E VKVfTKYGS LLNPQAKI VNVTANL 1SSS F PEAN 
3GNERTILISTAVTFVDVSAPAEAGFRAPPAINARLPFNFFFPF 


5888 


375 


2302 


LbCRTPGVAMQRADSEQPStoPRCDDSPRTPSNTPSAEADWSPG^ 

LELHPDYKTWGPEQVCSFLRRGGFEEPVLLKNIRENEITGALLP 

CLDESRFENLGVSSXXSERKKUiSYIQRLVQIHVDTMKVINDPIH 

GH I ELHPLLVR I IDTPQ FQRLR Y I KQLGGG Y YVFPG ASHNRFEH 

SIiGVGYLAGCLVHALGEKQPELQISERDVLCVQIAGLCHDLGHG 

PFSHMFDGRFIPIiARPEVKWTHEGjGSVMMFEHLINSNGIKPVME 

Q YGL ipeedi CFI KEQ I vgples pvedslwp ykgrpenks flye 

IVSNKRNG1DVDKWDYFARDCHHLGIQNNFDYKRFIKFARVCEV 
DNELRICAl^DKEVGNLYDMFHTRNSLHRRAYQHKVGNIIDTMIT 
DAFLKADD Y I E I TGAGGKKYR I S TAI DDMBA YTKLTDNI FL EI L 
YSTDPKLKDAREILKQIEYRNLFKYVGETQPTGQ1KIKREDYES 
LP KE VAS AKP KVLLD VKLKAEDF I VDV INMD YGMQBKN P I D HVS 
FYCKTAPNRA I R I TKNQ VS QLL P \ E KFAEQ\ L I RVYCKKVDRKS 
LYA\ARQYFVQW\CADR\NFT\KPQDGRCY*PPTP*HPQKKGW\ 
NDSTFSPKIPTRLPRRLPKSRV\QLFKDDPM 


5889 


1831 


731 


LPAACGRPVTARPRQAPEGRSGRPRPLDPYPPQVFPPRPDRVAI 
VTGGTDGIGYS TAKHLARLGKHVI I AGNNDS KAKQ WSKI KEET 
LNDKET * VLL CCPG WLCLWNS SDP P7S AS RGAGTTG VHHH FL»LK 
FGI FI L \ DLASMTS IRQFVQKFKMKKI PLHVL INNAGVMMVPQR 
KTRDGFEEHFGLNYLGHFLLTNLLIjDTIiKESGSPGHSARWTVS 
SATHY VAELNMDDLQS S AC YS PHAAYAQS KLALVLFTYHLQ RLL 
AAEGSHVTANWDPGWNTDLYKHVFWATRLAKKLLGWLLFKTP 
DEGAWTS I YAAVTPELEGVGGRYLYNKKETKSLHVTYNQKLQQQ 
LWSKSCEMTGVLDVTL 


5890 


1322 


200 


FRRGWSAAGRAVPVAFCSRISASSPRRPRGAVRLQSGTEAACRS 
GRPDPRPASAAGGHAGERMSQRDTLVHLFAGGCGGTVGAILTCP
LEWKTRLQSSSVTLYISEVQLNTMAGASVNRWSPGPLHCLKV
ILEKEGPRSLFRGLGPNLVGVAPSRAIYFAAYSNCKEKLNDVFD
PDSTQVHMISAAMAGFTAITATNPIWLIKTRLQL*/SQGTAGKR
RMGAFECVRKVYQTDGLKGFYRGMSASYAGISETVIHFVIYESI
KQKLLEYKTASTMENDEESVKEASDFVGMMLAAATSK\LVATTI
AYPHEWRTRLREEGTKYRSFFQTLSLLVQEEGYGSLYRGLTTH
LVRQIP\NTAIMMATYELWYLLNG


5891 


1322 


200 


FRRGWSAAGRAVPVAFCSRISASSFRRPRGAVRLQSGTEAACRS
GRPDPRPASAAGGHAGERMSQRDTLVHLFAGGCGGTVGAILTCP
LEWKTRLQSSSVTLYISEVQLNTMAGASVNRWSPGPLHCLKV
ILEKEGPRSLFRGLGPNLVGVAPSRAIYFAAYSNCKEKLNDVFD
PDSTQVHMISAAMAGFTAITATNPIWLIKTRLQL*/SQGTAGKR
RMGAFECVRKVYQTDGLKGFYRGMSASYAGISETVIHFVIYESI
KQKLLEYKTASTMENDEESVKEASDFVGMMLAAATSK\LVATTI
AYPHEWRTRLREEGTKYRSFFQTLSLLVQEEGYGSLYRGLTTH
LVRQIP\NTAIMMATYELWYLLNG


5892 


1764 


379 


WLRVCGRLSVKSAVSSRTGGWSAGLTCAMQRLQWLGHLRGPA^
DSGWMPQAAPCLSGAPHASAADVVVVHGRRTAICRAGRGGFKDT
TPDELLSAVMTAVLKDVNLRPEQLGDICVGNVLQPGAGAIMARI
AQFLSDIPETVPLSTVNRQCSSGLQAVASIAGGIRNGSYDIGMA

CGVESMSLADRGNPGNITSRLMEKEKARDCLrPMGITSENVAER 

FGISREKQDTFALASQQKAARAQSKGCFQAEIVPVTTTVHDDKG

TKRSITVTQDEGIRPSTTMEGLAJOjKPAFKKDGSTTAGNSSQVS 

DGAAAILLARRSKAEELGLPILGVLRSYAWGVPPDIMGIGPAY
AIPVALQKAGLTVSDVDIFEINE\AFASQAAYCVEKLRLPP*EG
*TPLGGASGP*GHPLGLHWGHVQVITLAQ*S*SARGKRAYRSGC
PCAIGSWNGSPLPVFEYPWGT


5893


3


1553


I LS KRRCOKAKTKSTiMAKKVZkVTrt "b. ftU^'nf Tct v^mm'upr ittW""" 

CFERTEDIGGVWRFKENVEDGRASIYQSWTNTSKEMSCFSDFP
MPEDFPNFLHNSKLLEYFRIFAKKFDLLKYIQFQTTVLSVRKCP
DFSSSGQWKWTQSNGKEQSAVFDAVMVCSGHHILPHIPLKSFP
GMERFKGQYFHSRQYKHPDGFEGKRILVIGMGNLGSDIAVELSK
NAAQVFISTRHGTWVMSRISEDGYPWDSVFHTRFRSMLRNVLPR
TAVKWMIBQQMNRWFNHENYGLEPQNKYIMKEPVLNDDVPSRLL

CGAI KVKST VKELTETS AI FEDGTVEENIDVI I FATGYS FS FPF 
LEDSLVKVENN^SLYKYIFPAHLDKSTLACIGLIQPLG'gYpW - 
AELQARWVTRVFKGLCSLPSERTMMMDIIKRNEKRIDLFGESQS 
QTLQTNYVDYLDELALEIGAKPDFCSLLFKDPKLAVRLYFGPCN 
SY*YRLVGPGQWEGARNAIFTQKQRILKPLKTRALKDSSNFSVS 
FLLKI LGLLAVWAFF \ CQLQWS 


5894 


174 


1673 


RYS PKKVLQNKESSLKLGMATALVSAHSIiAPLNLKKEGLRWRE 
DHYSTWEQGFKLQGNSKGLGQEPLCKQFRQLRYEETTGPREALS 
RLRELCQQWLQPETHTKEHILELLVLEQFLIILPKELQARVQEH 
nva&tuujv v v vuctuLijhuiJ^h lXaQQ VDPDQPKKQKILVEEMAPL 
KGVQEQQVRHECEVTKPEKEKGEETRIENGKLIWTDSCGRVES 
SGKISEPMEAHNEGSNLERHQAKPKEKIEYKCSEREQRFIQHLD 
LIEHASTHTGKKLCESDVCQSSSLTGHKKVLS*ERKVIQC\HGV 
LGKAFQRSSHLVRHQKIHLGEKPYQCNECGKVFSQNAGLLEHLR 
I HTGE KP YIjC I HCG KNFRRSSHLNRHQR I HSQEE PCECKECG KT 
FS QALLLTHHQRIHSHS KS HQCNECG KAFS LTS DL I RHHR I HTG 
EKPFKCNICQKAFRLNSHIiAQHyRlHNEEKPYQCSECGEAFRQR 
SGLFQHQRYHHKDKLA


5895


2967 


86 


HPSLLGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIHGE
MRLFVSDGVPGCLPVLAAAGRARGRAEVLISTVGPEDCVVPFLT
RPKVPVLQLDSGNYLFSTSAICRYFF\LLSGWEQDDLTNQWLEW
EATELQPTLSAALYYL\WQGKKG\EDVLGSVRRTLTHIDHSLS
RQ\NCPFLAGETESLADIVLWGALYPLLQDPAYLPEELSALHSW
FQTLSTQ\EPCQR\AARRLVLKQ\QGVLALR\PYLQKQPQPSPA
EGKGLSPIEPEEEELATLSEEEIAMAVTAWEKGLESLPPLRPQQ
NPVLPVAGERNVLITSALPYVNNVPHLGNIIGCVLSADVFARYS
RLRQWNTLYLCGTDEYGTATETKAL\EEGLTPQEICDKYHIIHA
DIY\RWFNISFDIFGRTTTPQQ\TKIT\QDIFQQLLKRGFVLQD
TVEQLRCEHCARF\LADRFVEGVCPFCGYEEARGDQCDKCGKLI
NAVELKKPQCKVCRSCPWQSSQHLFLDLPKLEKRLEEWLGRTL
PGSDWTPNAQFITPFFGFREWPSKPRWQ*TRDLK\WGNPGTP*E
GFEDK\VFYVWFDATIGYLSITANYTDQWERWW\KNPEQVDLYQ
FM\AKDNVPFHSLVFPSSALGAEDNYTL\VSHLIATEYLNYEDG
K\FSKSRGVGVPRDM\AHDTGIPPDISRFYL\LYIRPEGK\DSA
FSWTDLLLKNNS\ELLNNLGNFINRA\GMFVSKFFGG\YVPEMV
LTPDDQRLLA\HVTLELQKYHQ\LLEKVRIRDALRSILTIS\RH
GNQYI\QVNEPW\KRIKGSEADRQRAGTVTGLAVNIAALLSVML
QPYMPTVSATIQAQLQLPPPACSILLTNFLCTLPAGHQIGTVSP
LFQKLENDQIESLRQRFGGGQAKTSPKPAWETVTTAKPQQIQA
LMDEVTKQGNIVRELKAQKADKNEVAAEVAKLLDLKKQLAVAEG
KPPEAPKGKKKK


5896


2967


86 


HPSLLGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIHGE
MRLFVSDGVPGCLPVLAAAGRARGRAEVLISTVGPEDCVVPFLT
RPKVPVLQLDSGNYLFSTSAICRYFF\LLSGWEQDDLTNQWLEW
EATELQPTLSAALYYL\WQGKKG\EDVLGSVRRTLTHIDHSLS
RQ\NCPFLAGETESLADIVLWGALYPLLQDPAYLPEELSALHSW
FQTLSTQ\EPCQR\AARRLVLKQ\QGVLALR\PYLQKQPQPSPA
EGKGLSPIEPEEEELATLSEEEIAMAVTAWEKGLESLPPLRPQQ
NPVLPVAGERNVLITSALPYVNNVPHLGNIIGCVLSADVFARYS
RLRQWNTLYLCGTDEYGTATETKAL\EEGLTPQEICDKYHIIHA
DIY\RWFNISFDIFGRTTTPQQ\TKIT\QDIFQQLLKRGFVLQD
TVEQLRCEHCARF\LADRFVEGVCPFCGYEEARGDQCDKCGKLI
NAVELKKPQCKVCRSCPWQSSQHLFLDLPKLEKRLEEWLGRTL
PGSDWTPNAQFITPFFGFREWPSKPRWQ*TRDLK\WGNPGTP*E
GFEDK\VFYVWFDATIGYLSITANYTDQWERWW\KNPEQVDLYQ
FM\AKDNVPFHSLVFPSSALGAEDNYTL\VSHLIATEYLNYEDG
K\FSKSRGVGVFRDM\AHDTGIPPDISRFYL\LYIRPEGK\DSA
FSWTDLLLKNNS\ELLNNLGNFINRA\GMFVSKFFGG\YVPEMV
LTPDDQRLLA\HVTLELQKYHQ\LLEKVRIRDALRSILTIS\RH
GNQYI\QVNEPW\KRIKGSEADRQRAGTVTGLAVNIAALLSVML
QPYMPTVSATIQAQLQLPPPACSILLTNFLCTLPAGHQIGTVSP
LFQKLENDQIESLRQRFGGGQAKTSPKPAWETVTTAKPQQIQA
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








LMDEVTKQGNIVRELKAQKADKNEVAAEVAKLLDLKKQLAVAEG 
KPPEAPKGKKKK 


5897


2967


86 


HPSIXGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIKGE 
MRLFVSDGVPGCLPVLAAAGRARGRAEVLISTVGPEDCVVPFLT 
RPKVPVLQLDSGNYLFSTSAICRYFF\LLSGWEQDDLTNQWLEW 
EATELQPTLSAALYYL\WQGKKG\EDVLGSVRRTLTHIDHSLS 
RQ\NCPFLAGETESIiADIVLWGALYPLLQDPAYLPEELSALHSW 
FQTLSTQ\EPCQR\AARRLVLKQ\QGVLALR\PYLQKQPQPSPA 
EGKGLS PI E PEEEELATLSEEEI AMAVTAWEKGLBSLPPLRPQQ 
NPVLPVAGERNVLITSALPYVNNVPHLGNI I G CV LS ADVFAR Y S 
RLRQWNTLYLOGTDEYGTATETKAL\EEGLTPQEICDKYHIIIIA 
DI Y \RWFN I S FD I FGRTTTPQQ \ T KI T \ QD I FQQLLKRG FVLQD 
TVEQLRCEHCARF\LAI)RFVEGVCPFCGYEEARGDQCDKCGKLT 
NAVELKKPQCKVCRSCPVVQSSQHLFLDLPKLEKRLEEWLGRTL 
PGSDWTPNAQFITPFFGFREWPSKPRWQ*TRDLK\WGNPGTP»E 
GFEDK WFYVWFDATIG YLS I TANYTDQWERWW\KNPEQVDLYQ 
t m \AADNVPt HSLVFPS SALGAEDNYTI>\VSHLIATEYIjNYEDG 
X\FSKSRGVGVFRDM\AHDTGIPPDISRFYL\LYIRPEGK\DSA 
FS WTDLLLKNNS \ELLNNLGN FINRA\GMF VS KFFGG \ YVPEMV 
LTPDDQRLLA\HVTLELQHYHQ\ LLEKVR I RDALR S I LT I S \ RH 
GNQYI \QVNEPW\KRIKGSEADRQRAGTVTGIiAVNIAALLSVML 
QPYMPTVSATIQAQLQLPPPACSILLTNFLCTLPAGHOIGTVSP 
LFQKLENDQ I ESLRQR FGGGQAKTS P KPAWET VTTAKP QQ I QA 
LMDEVTKQGNIVRELKAQKADKNEVAAEVAKTJaDLKKQLAVAEG 
KPPEAPKGKKKK 


5898 


2967 


86 


HPSLLGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIHGE
MPXFVSDGVPGCLPVIJU\AGRARGRAEVLISTVGPEDCVVPFLT 
RP KVP VLQLDS GN YLFSTSA I CR Y FF\LLS GWEQDDLTNQ WL E W 
EATELQPTLSAALYYL\WQGKKG\EDVLGSVRRTLTHIDHSLS 
RQ\NCPFLAGETESLADIVLWGALYPLLQDPAYLPEELSALHSW 
FQTLSTQ \ E PCQR \ AARRL VL KQ \ QG VLALR\ P YLQ KQ PQPS PA 

egkglspiepeeeelatlseeeiamavtawekgleslpplrpqq 
npvlpvagernvlitsalpyvnnvphlgni igcvlsadvfarys 
rlrqwntlylcgtdeygtatetkal\eegltpqeicdkyhiiha 

DIY\RWFNISFDIFGRTTrPQQ\TKIT\QDIFQQLLKRGFVLQD 
TVEQLRCEHCARF\liADRFVEGVCPFCGYEEARGDQCDKCGKLI 
NAVELKKPQCKVCRSCPWQSSQHLFLDLPKLEKRLBEWLGRTL 
PGS DWTPNAQFITP FFGFREW PS K PRWQ* TRDLK\WGNPGTP * E 
GFEDK\VFYVWFDATIGYLSITANYTDQWERWW\KNPEQVDLYQ 
FM\ AKDNVPFHSIiVFPfi *5 AT/JZVFnNVTTA vcur.TaTPVT vTvcrv^ 

K\ FS KSRG VG VFRDM \ AHDTG I P PD I SR FYL\ LY IRP EGK \ DS A 
FSWTDLLLKNNS\ELLNNLGNFINRA\GMFVSKFFGG\YVPEMV 
LTPDDQRLLA\HVTLELQHYHQ\LLEKVRIRDALRSILTI3\RH 
GNQYI \QVNE PW\KR IKGSEADRQRAGTVTGLAVNIAALLSVML 
QPYMPTVSATIQAQLQLPPPACSILLTNFLCTLPAGKQIGTVSP 
LFQKLENDQlESLRQRFGGGOAKTSPKPAWETVrTAKPOOTfia 
LMDE VTKQGN I VRELKAQKAD KNEVAAEVAKLLDLXKQLA VAEG 
KPPEAPKGKKKK 


5899


326 


1078 


NCPKSKEPNGVRAPSLPSPLRAAMALSDVDVKKQIKHMMAFIEQ
EANEKAEEIDAKAEEEFNIEKGRLVQTQRLKIMEYYEKKEKQIE 
QQ KKI LMS TMRNQ ARLKVLRARNDL I S DLLS EAKLRLS RIVED P 
E VYQGLLDKLVLQGLLRLLE P VM I VRCR P \QD LLLVEAAVQ KAI 
PEYMTI SQKHVEV\ QI DKEA*LAVECS WEW7E VYSGNQRI KVSN 
TLESRLDLSAKQKMPEIRMALFGANTNRKFFI 


5900 


64


1409 


KAASRDSPCLEFCPLCGVSSHDLQHRMWYHRLSHLHSRLQDLLK 
GGVIYPALPQPNFKSLLPLAVHWHHTASKSLTCAWQQHEDHFEL 
KYANT VMRFDYVWLRDHCR9ASCYNSKTHQRSLDTASVDLCI KP 
KTI RLDETT LF FTW PDGHVTKYDLN HLVKNS YEGQKQKVIQPR I 
LWNAE I YQQAQVPS VDCQS FLETNEGLKKFLQNFLLYGI AFVEN 
VPPTQEHTEKLAERISLIRETIYGRMWYFTSDFSRGDTAYTKLA 
LDRHTDTT Y FQE P CG I QVFHCLKHEGTGGRTLL VDG F YAAEQVL 
Q KAP BEFELLSKSAI\ KHEY I ED VGECHQPHD WDW AQS * t STHG 
/YKE LYL I R YNNYDRAVINTVP YDWHRWYTAHRTLTI ELRRPE 
NEFWVKLKPGRVLPIDNWRVLHGRECFTGYRQLCGCYLTRDDVL 
NTARLLGLQA 


5901 




2121 


VAI EQTSLKMMQAVGGAPARPTG EY I CNQCGAKYTS LDS FQTHL 
KTHLDTVL? KLTCPQCNKEFPNQESLLKHVTI HFM ITSTY YICE 
SCDKQFTSVDDLQiUILIiDMHTFVFFRCTLCaEVFDSKVSIQLHL 
\AVKHSNEKKVYRCTSCITODFPJIETDIjQLHVKHNH1£N(>GKVHK 
CIFCGESFGTEVELQCHITTHSKKYNCKFCSKAFHAirLLEKHL 
REKHCVFETKTPNCXSTNGASEQVQKEEVELQTLliTNSQESHNSII 
DGSEEDVDTSEPMYGCDICGAAYTMETLLQNHQLRDHNIRPGES 
AIVKKKAELIKGNYKCNVCSRTFFSENGLREHMQTHLGPVKKYM 
\.r xsAJCKr tfouuiLii tnKVlHoK^LL/rGWCRICKMPLQSEEEFIi 
EHCQMK PDLRNS LTGF^CWCMQTVTSTLELIGCHGTFHMQKTGN 
GSAVQTTGRGQHVQKLYKCASCLKEFRSKQDLVKLDINGLPYGL 
CAGCVNLSKSAS PGINVPPGTNRPGLGQNENLSA I EGKGKVGGL 

QVS PMPRIS PSQSDEKKTYQCIKCQMVFYNEWDIQVHVANHK ID 
EGLNHECKliCSQTFDS PAKLQCHLIEHS FEGMGGTFKCP VCFT V 
FVQANKLQQH I FS AHGQEDKI YDCTQCPQKFFFQTELQNHTMTQ 
HSS 


5902 


712 


209 


L KNRRRS RPS I RQS I GSTS VS RWLTS LFTY LDHTAD VQ * V* RE F 
IPLXPRQ* ED* MFQSWLHAWGDTLEEAFEQCAMAMFGYMTDTGT 
VE PLQTVE VETQGDDLQS LLFK FLDEWL YKFSADE FFI P \GWGE 
EFS LS KKPQGTE VKA I TYS AMQVYNE EN PE VFVI ID I 


5903 


210^ 


735 


DTPGPSLPSTTAPFSLRSLSFPSRPSYLLPGDPQPLQGRGLPTT 
PALFALSAVPGGAASPMPPSGIiRLLPLLLPLLWLLVLTPGRPAA 

LPEA VLALYNS TRDR VAGESAE P E PE PEAD YYAKEVTRVLM VET 
HNE I YDKFKQSTHS I YMFFNTS ELRE AVPEPVLLSRAELRLLRL 
KLKVEQH VEL YQK YSNNS WRYLSNRL LAP S DS PE WLS FD VTG W 
RQWLSRGGEIEGFRLSAHCSCDSRDNTLQVDINGFTTGR\RGDL 
ATI HGMNRPFLLLMATPLERAQHLQS \SRHRQAL\DTNY\ CFS F 
HGGRNCLRC/VHC*HLI FRKDL\GW\ KMI \HE\ PKGYHANFC\L 
G PCPY I WSLDTQYSKVLAL YNQ\HKPG\ASAAP \ CCVPQALEP \ 
LPIVYY\VGRKPKVEQLSNMIVRSCKCS 


5904 


3 


1126 


ALGNS ETE KAFRAI S S KVP VDKVTP S TLPEE VLD FEKFLQQTGG 
RQGAWDDYDHQNFVKVRNKHKGKPTFMEEVLEHLPGKTQDEVQQ 
HEKWYQKFLALEERKKES IQI WKTKKQQKREE I F KLKEKADNTP 
VLFHNKQEDNQKQKEEQRKKQKLAVEAWKKQKS I EMSMKCASQL 
KBEEEKEKKHQKERQRQFKLKILLESYTQQKKEQEEFLRIiEKEI 
RBKAEKAEKRKNAADEISRFQERDLHKLELKILDRQAKEDEKSQ 
KQRRLAKLKEKVENNVSRDPSRLY/NTHQRLGRTNQKDRTNRLW 
ATSTYPT*GYSNLETRNTEKSMR 


5905 


287 


2912 


MASFPPRVNE^tVRLRTtGEGLAPAAPFDKKCGRENWTVAFAP 
DGSYFAWSQGHRTVKLVPWSQCLQNFLLHGTKNVTNSSSLRLPR 
QNSDGGQKNKPREHIIDCGDIVWSLAFGSSVPEKQSRCVNIEWH 
RFRFGQDQLLLATGLNSGRI KI WDVYTGKLLLNLVDHTG VVRDL 
TFA PDGSLI L VS AS RD KTLRVWDLRDDGN\MMKVLRGHQN WVY \ 
SCAFS PDSSMLCSVGASKAWAAILV* LRLCWHHSHT3ATMVLS 
WAE R VAS LATG LGATFT IG * SNLAFVLQG VL YVHRCWSM S T FCF 
SFFLFFFFKVISPTVKYH*LLSKLIFQFYGIGSLTSBTNLM*SI 
WLSNGFSVLFFGII»SDSRDILRL*FNLXFVLIFF*K*CIVSVQK 
KKKPKRIALLQEERLS*DKPPSSHLI*QTEVNIRIIjFRAILHS* 
LLI FR I *NCI * T YS * I IDP FYIQMTYDRG*FGKNKMVKF* F I EM 
*LYYFHKIAFSFCNW*HPCCLPKKFHLAVNILFACSICFSS*A 
QVGDPSLL*TSDYLKGRCQWSNNLLTLRFLSVYFFKNLWSGKK 
REGGL*YLTLFISVYFS*LVFGINGFUYSFWKLHCLYFMFRLI 
FKLTFNRNI*NRICMSALINLKTDFNLTMTLSIFFKLLIIYNA* 
YNLN*I*QF*YKMCHFVI>CMSE*SYNICLFIAGF\LWNMDKYTM 
i RKLEGHHHDWACDFS pdgallatas ydtrvyiwdphngdilm 
efghlfppptpifaggandrwvrsvsfshdglhvasladdkmvr 
fwridedypvqvaplsnglccafstdgsvlaagthdgsvyfwat 
prqvpslqhlcrms irrvmptqevqelp ipskllefls yri 


5906 


146 


2038 


regagsgrmasga\ynpyieiieqprqrgmrfrykce£rsags1 
pgehstdnnrtyps i q i mnty yg kgkv\r i tlvtk\ndp ykjphph 
dlvgkdcrd\gyyeaefgqe\rrp\lffqn\lgircvkkkbvke 

A\ I ITR\ I KAG INPFDVP*KQLNDIEDCDLDWRLWFRVFLPDG 
HGNL\TTALPPV\VSSPIYDNRAPNTAELRVCRVNKNCGSVRGG 
DEI FLLCDKVQKDDI EVRFVLNDWEAKG I FSQAD VHRQVAI VFK 
TPPYCKAITEPVTVKMQLRRPSDQEVSESMDFRYLPDEKDTYGN 
KAKKQKTTLLFQKLCQDHVETGFRHVDQDGLELLTSGDPPTLAS 
QS AG I TVN FPER PR PGLLGS IGEGR YFKKE PNL FSHDA WREMP 
TGVSSQAESYYPSPGPISSGLSHHASMAPLPSSSWSSVAHPTPR 
SGNTNPLSS FSTRTLPSNSQGI PPFLRI PVGNDLNASNACI YNN 
ADD I VGMEAS SM PSADL YG ISDPNMLSNCS VNMMTTS S D SMGET 
DNPRLLSMNLENPSCNSVUDPRDLRQLHQMSSSSMSAGANSNTT 
VFVSQSDAFEGSDFSCADNSMINESGPSNSTNPNSHVFVQUSQY 
SGIGSMQNEQLSDSFPYEFFQV 


5907 


99 


1873 


TYLLSSVJSS * *NLDT£iKSQVKV/kmtikki$tipYP$PAKQlXGK 
KATSKVPSAPHFVHPNDHANREAELKXKWVEEMREKQQAAREQE 
RQKRRTIESYCQDVLRRQEEFEHKEEVLQELNMFPQLDDEATRK 
AYYKEFRKWEYSDVILEVLDARDPLGCRCFQMEEAVLRAQGNK 
KLVLVLNKIDLVPKEWEKWLDYLRNELPTVAFKASTQHQVKNL 
NRCSVPVI^ASESLLKSKACFGABNLMRVLGNYCRLGEVRTHIR 
VGWGLPNVGKS5LINSLKRSRACSVGAVPGITKFMQEVYLDKF 
IRLLDAPGIVPGPNSE VGT I LRNCVHVQKLADPVTPVET I LQRC 
NLEEI SNYYGVSGFQTTBH FLTAVAHRLGKKKKGGLYSQEQAAK 
AVLADWVSGKISFYIPPPATHTLPTHLSAEIVKEMTEVFDIEDT 
EQANEDTMECLATG E SDELLGDTD PLEMEI KLLHS PMTK IADAI 
ENKTTVYKIGDLTGYCTNPNRHQMGWAKRNVDHRPKSNSMVDVC 
SVDRRSVLQRIMETDPLQQGQAIiASALKNKKKMQKRADKIASKL 
SDSMMSALDLSGNADDGVGD 


5908 


247 


975 


HCG I KKRGEGSG S PS PASG G FQLG CQ I P 3 PSLPS EEETtt PHTRA 
HTRTLRATLTRRPPRSHSTRLRFPMPLDGDGGLASWK/PMRER* 
GWRR PAKAAGAS LG VAATGKRGCRMS KRYLQKATKGKLLI IIP! 
VTLWGKWSSANHHKAHHVKTGTCEWALHRCCNKNKIEERSQT 
VKCS CFPGQVAGTT RAAPS CVDAS I VEQ KWWCHMQ PCLEGEE CK 
VLPDRKGWS CSSGNKVKTTRVTH 


5909 


1 


5002 


PAIPGSTI IWAPGSHSAARADGRHGSIiPSQSQAPGALCGARAPP ' 

SSNLRADRSMICAQARAGKNIiYHNRFLGLAAMAFPSRNSQSLRR 

CKEPIRYSYNPDQFHNMDLRGGPHDGVTIPRSTSDTDLVTSDSR 

STLMGRSS YYS IGHSQDLVIHWDIKEBVDAGDWIGMYL I DEVLS 

ENFLDYKNRGVNGSHRGQIIWKIDASSYFVEPETKICFKYYHGV 

SGALRATTPS VTVKNSAAP I FKS IGADET VOGQGSRRL I S FSLS 

DFQAMGLKKGMFFNPDPYLKISIQPGKHSIFPALPHHGQERRSK 

I IGNTVNP I WQABQ FS F VS LPTDVLE I EVKD KFAKS R P 1 1 KR FL 

GKLSMPVQRLLERHAIGDRWSYTLGRRLPTDHVSGQLQFRFEI 

TSSIHPDDEEI SLSTEP ES AQ I QDSPMNNLMESGSG E PR S EAPE 

SSESWKPEQLGEGSVPDRPGNQSIELSRPAEEAAVITEAGDQGM 

VSVGPEGAGELLAQVQKDIQPAPSAEELAEQLDLGEEASALLLE 

DGEAPASTKEEPLEEEATTQSRAGREBBEKEQEBEGDVSTLEQG 

EGRLQLRASVKRKSRPC3LPVSELETVIASACGDPETPRTHYIR 

IHTLLHSM PSAQGGSAAEEEDGAEEESTLKDSSEKDGLS EVDTV 

AADPSALEEDREEPEGATPGTAHPGHSGGHFPSLANGAAQDGDT 

HPSTGSESDSS PRQGGDHSCEGCDASCCS PSCYSSSCYSTSCYS 

S S C YS AS C YS PSCYNGNRFASHTRFSS VDS AKI S ES TVFS SQDD 

EEEENSAFESVPDSMQSPELDPESTNGAGPWQDELAAPSGHVER 

SPEGLESPVAGPSNRREGECPILHNSQPVSQLPSLRPEHHHYPT 

IDEPLPPNWEARIDSHGRVFYVDHVNRTTTWQRPTAAATPDGMR 

RSGS I QQMEQLNRR YQN I QRTIATBRSE EDSGSQS CEQ APAGGG 


5910 






GGGGSDSEABSSQSSLDLRREGSLSPVNSQKITLLIjQSPAVKFI 
TN P E FFTVLHANY SA YRVFTSSTCLKHMILKVRRDARNFER YQH 
NRDLVNFINMFADTRLELPRGWE I KTDQQGKS FFVDHNSRATTF 
IDPRI PLQNGRLPNHLTHRQHLQRLRSySAGBASEVSRNRGASL 
LARPGHSLVAAIRSQHQHESLPLAYNDKIVAFLRQPNIFEMLQE 
RQPSLARNHTLREKIHYIRTEGNHGLEKLSCDADLVILLSLFEE 
EIMSYVPLQAAFHPGYSFSPRCSPCSSPQNSPGLQRASARAPSP 
iKRDFEAKLRNFYRKLEAKGFGQGPGKIKLI IRRDHLLEGTFNQ 
VMAYSRKELQRNKLYVTFVGEEGLDYSGPSRBFFFLLSQELFNP 
YYGLFEYSANDTYTVQISPMSAFVENHLEWFRFSGRILG\LALI 
HQ YLL DAFFT \RP F YKALL \ RLPC \ D\ LSDLE YLDE EFHQS LQW 
MKDNNITDILDLTFTVNEEVFGQVTERELKSGGANTQVTEKNKK 
E YIBRM VKWR VERG WQQTEAL VRGF YEWDS RL VS VFDARELE 
LVIAGTAEIDLNDWRNNTEYRG5YHDGHLVZRWFWAAVERFNNE 
QRLRLLQFVTGTSSVPYEGFAAPPWEPMGLRRFLP^KKWGKITS 
L P PRG \ HTCLQPD WDL PTVS PRTPML YE K \ LLTA\ VEETST FGT 


5911 


1526 


446


VAEFAAMEPGRTQIKLDPRYTADLLBVLKTNYGI PSACFSQP^T 

AAQLLRALGPVELALTS ILTLLALGS 1 AI FLEDAVYLYKNTLCP 

I KRRTLL WKS SAPTWS VLCC FGLWI PRS LVLVEM TI VS PYAVC 

FYLLMLVMVEGFGGKEAVLRTLRDTPMMVHTGPCCCCCPCCPRL 

LLTRKKIjQ\R*CWALSNTPS*R*R*PWWACFSSPTASMTQQTFL 

RGAQLYGSTLSSA/CSTLLALWTLGI ISRQARLHLGEQNMGAKF 

ALF<JVLLILTALQPSIFSVIJu^GGQIACSPPYSSKTR3QVMNCH 

LLILETPLMTVLTRMYYRRKDIIKVGYETFSSPDLDLNLKALRWM 
AWTMKGCCTH 


5912 


109 


595 


q^plapciqgkglemrspkpqsfiirsshsgagllvknpstpVf - 

CGHRRGGAAFKYKPTPWGPEQRPTGQKHMRGGVSLLSPRLECS 
GT I S AHCNLR L P S S SNS PAPAS * LAG I TG VCHHAQL 1 F VFL VET 
GFHHVGQAGLELL/NWTHIiPRPPKVLGLQA 




924


277 


MIl^KALMLUALALTTVMSPCGGEDIVADHVASYGVNLYQSYGP 
SGQYSHEFDGDEEFYVDLERKETVWQLPLFRRFRRFDPQFALTN 
IAVLKHNI^IVIKRSNSTAATNEVPEVTVFSKSPVTLGQPNTLI 
CLVDNIFPPWNITWLSNGHSVTEGVSETRPSSPKSDHFr^QDQ 
VTSPSFPFE* *DL*TAKVEQLGAWFEPLLKHWGAE I PTTL 


5913 
5914 


46 


1198 


QLRMAGAEGAAGRQSELEPWSLVDVLEEDEELENEACAVLGGS 
DS E KCS YS QG S VKRQAL YACS TCT PEGEE PAG I CLACS YECHGS 

HKLFELYTKRNFRCDCGNSKFKNLECKLLPDKAKVNSGNKYNDN 
FFGLYCICKRPYPDPEDEIPDEMIQCWCEDWFHGRHLGAXPPE 
SGDFQEKVCQACMKRCSFLWAYAAQLAVTKIST\GMMDWCGTLM 
E*/DDQEVIKPENGEHQDSTLKEDVPEQGKDDVREVKVEQNSEP 
CAGSSSESDLQTVFKNESLNAESKSGCKLQELKAKQLIKKDTAT 
* ifUHHRoftijL i MjLit.roKM zuULD VLFLTDEYDTVLAYENKGKI 
AQAT3RSDPLMDTLSSMNRVQQVELIC/GIQ»FED 


5915


960 


124 


NLGGSELPPEEALFIQVASMNQRRVDFYIjAS IEDMLVAI /GGRN 

EJJGAIiSSVETYSPKTDSWSYVAGLPRFTYGHAGTIYKDFVYISG 

GHDYQIGPYRKNLLCYDHRTDW7EERRPMTTARGWHSMCSLGDS 

I YS XGGSDDNIESMERFDVLGVEAYSPQCNQWTRVAPLLHANSE 

v a v w CA7« l r I ix^ YS WENTA FS KTVQ V YDREADKWS RG VDLP 

KAIAGGSACFIAP*SLGQRTRKRKAXARGTRTGASDPSCASWDH 
PHRHL PGLCR PAATS 


5916


1604 


703


FPGRPTRPLKLGRRRKRARI IUAPHCUSPRPRTCPPGALQAPEA 
PASRAEGPVAVWNGHTEG PAPARS AP KE P PGL PR PLGS FPCPT 
PQEDFPALGGPCPPRMPPSPGFSAWLLKGTPPPPPPGLVPPIS 
KPPPGFSGLLPSPHP\PVSPAPPPPPPQK/RPRLLPAP/PGLPS 
PRELPGEEPSAHPVHQGLPAERRGPLQRVQEPLRGVQTGPDLRS 
PVLQELPGPAGGEFPEGL* + AAGPAAH 


5917


256 
1343 


827


SPRMWEIWGPWHRWESFSLEGEWPSRIPEPSPDSTKGTSGKGCR 
rVTG A VH RHLNHV AG I IPWVLHSQLKPTAATAQDQWTSQQYPDH 
?TRLILQ*NQATADKNN*TTALLQPHQRL\VSPRMAEA 
\HQILTYLEP/ICLWNYUKIIiTVFLTKSVLEI*KFIHTPQTYR 
?*NDPFGlKEVyVSRRLRKTSP/kLAVTFLEQAWSKECVPVDQ 
FMEHLL P S L LS LAS D P VPNVRVLLAKALR QMLIjE KAY FRNAGNP 
HLEVIBETILALQSDRDQDVSPPAALEPKRRNIIDTAVLBKQN 


5918 


13 


1247 


EGAQVARRRSRRQWRAGRCGRGRGGRRAERTGGRGPPGRPRPLP 
PGPARRGR R RMET P F YGDE ALSGLGGGASGSGGTFAS P GRLFPG 
A? PTAAAGS MMKKDALTLS LS EQVAAALKPAPAP AS Y p P A\ ADG 
APSAAPPDGLLASPDLGLLKLASPELERLI IQSNGLVTTTPTSS 
Q FLYP KVAAS EEQE FAEGFVKALEDLKKQNQLGAGRAAAAAAAA 
AGGPSGTATGS AP PG ELAPAAAA P EAP V YA\ NLSS Y \ AGGCRGL 
RGG AAT \ VAF AAE PVP FPPP P P PG ALG PRR? / RLALQGRR PQT V 
PD VP \ S FGES P\ PLS P I ET\ DTPRR I \ KAKRKRL\ RNPQI RAP K 
PASRKLGAQSRALERESEDPS * S PEHGSLASTASLLREQVAQLX 
QKVLSHVNSGCQLLPQHQVPAY 


5919 


1 


4254 


TS VQGDSQGTPTSS QGS INME H W I S QA I HGS TTSTT S S S STQSG 
GSGAAH RLADVMAQTH I ENHSAP P D VTT YTS E HS I Q VERPQGST 
GSRTAPKYGNAELMETGDGVPVSSRVSAKIQQLVNTLKRPKRPP 
LREFFVDDFEELLEVQQPDPNQPKPEGAQMLAMRGEQLGWTNW 
PPSLE AALQRWGTISP KAPCLTTMDTNG KPLY I LTYGKLWTRS M 

kvays i lhklgt kqep m vrpgdrvalv f p nnd p aafmaafygcl 
laewpvp i e vpltrkd agsqq ig fllgs cgvtvaltsdachkg 
lpks ptge i pqfkgw pkllwfvte s khls kp prdwf \ ph i kdan 
ndtayieyktck\dgsvlgvtvtrtallthcqaltqacx;yteae 

TIVNVLDFKKDVGLWHGILTSVMNMMHVISIPYSLMKVNPLSWI 
QKVCQYKAXVACVKSRDMHWALVAHRDQRDINLSSLRMLIVADG 
ANPWS ISSCDAFLNVFQS KGLRQEVI CPCASSPEALTVAIRRPT 

ddsnqppgrgvlsmhgltygvirvdseeklsvltvqdvglvmpg 
aimcsvkpdgvpqlcrtdeigelcvcavatgtsyyglsgmtknt 
fe vfamtssgap i se ypf irtgllg fvgpgglvfwgkmdglmv 
vsgrrhnaddivataiavepmkfvyrgriavfsvtvlhderivi 
vaeqrpdsteeds fqwmsrvlqai ds ihqvgvyclalvpantlp 

KTPLGGIHLSETKQLFLEGSLHPCNVLMCPHTCVTNLPKPRQKQ 
PBIGPASVMVGNLVSGKRIAQASGRDLGQIEDNDQARKFLFLSE 
VLQ WRAQTTP DH I LYTLLNCRGA I ANS LTCVQLHKRAE KI AVML 
MBRGHLQDGDHVALVYPPGIDLIAAFYGCLYAGCVPITVRPPHP 
CNI ATTLPTVKMI VEVSRSAGLMTTQL I CKLLRSREAAAAVDVR 
TWPL I LDTDD * PKKRPAQ I CKPCNPJDTLAYLDFSVSTTGMLAGV 
KMSHAATSAFCRS I FCLQCELYPSRE VAICLDPYCGLGPVLWCLC 
SVYSGHQSILIPPSELETNPALWLLAVSQYKVRDTFCSYSVMEL 
CTKGLGS QTES LKARGLDLSRVRTCWVAEERPR IALTQS FS KL 
FKDLGLH PRAVSTS FGCRVNLAI CLQGTSGPD PTTVYVDMRALR 
HDRVRLVERGSPHSLPLMESGKILPGVRIIIANPETKGPLGDSH 
LGE I WVHS AHNAS G Y FT I YGDE SLQSDH FNSRLS FGDTQT I WAR 
TGYLGFLRRTELTDANGERHDALYWGALDEAMELRGMRYHPID
IETSVIRAHKSVTECAVFTWTNLLWW2LDGSEQEALDLVPLV 
TNWLEEHYLI VGVWVVDIGVI P INSRGEKQRMHLRDGFLADQ 
LDPIYVAYNM 


5920 


1381 


1499 


QLGAVAHAG VS R I PP * LF P PLHPTFLSLWCLHHKL P / HP PGASM
VRPPWPRRPPAHISSVRQASTQVPRTVPHTQRVAIXIGTQTTGP 
SGVGCC7PGRPLLPCKCSSAAHSTYRVQBPAVHIPGQEPLTASM 
LAAAPLHEQKQMIGERLYPLIHDVHTQLAGKITGMLLEIDNSEL 
LLMLES PESLHAKI DEAVAVLQAHQAMEQP KAYMH 


5921

5922


727 
2475 


157 
495 


VCPGTGGE * GLWGQLGGLPKE T PLK PMDAFTGSGLKRKFD D VD V 
GSSVSNSDDEISSSDSADSCDSLNPPTTASFTPTSILKRQKQLR 
RKNVR FDQVTVY YFARRQG FTS VPSCGG S SLGMAQRHNS VRS YT 
LCEFAQEQEVNHREILREHLKEEKLHAICKMKLTKNGTVESVEAD 
GLTLDDVSDEDIDVENVEVDDYFFLQPLPTKRRRALLRASGVHR 
I DAEEKQELRAI RLSREECGCD CRLYCD PEACACS QAG I KCQ VD 
RMSFPCX5CSRDGCGNMAGRIEFWPIRVRTHYLHTIMKLELESKR 
Q\GAAQQPQ\+GALPDCQLQPDRSTGL*DPSWIGSKGLSFTGKG 
AAATHLI ILRVIENRGAEGKRK 

S YSNWGLFPS VFIQVPRSRTGNLKPXFLETS YYE \CMETLkG\T 
CLYNATQYK^CSPRNDRPDACyNPSEPAATTVFEIRTGLIiLGDT~ 
SKIITRTEEKEIPKQITLRFDACAAINSKKLEIGCGSLN*ERS* 
RVENKYVCHESGVCKNCAYWPCVI * AT* KKNKNDS VYLQKGEAN 
PSCAAGHCNPLEI/I ITNPLDPHWKKGERVTLG INRTGLKPQWI 
LIKGEVHKCSPKPVFQTPYEEI^PAPELLKKTKNLFIiQLAENV 
I FLLNGTS CYVRGGTTIGDRWPWEA* ELVPTDPAPDI 1 P I * KAE 

ASNF^VLKTSIIROYClABFfJKnPT TV>\rr , VVKtr , Ti-nwr vrni«rrr 

TIT**DLNHTEKNPFSKFSKLKTA*AHAESH*DWTVPSGLY*IC 
RHRAYFRLPNKWADSCVIGTIKPSFFLLPIKMGELLGFSVYASR 
E KKG I V IGNW KDNEW PRERI IQY YG P ATWAQDG S WG YR / TP/ VY 

MLNWIIRLOATLFT TQNTTTY2DAT TUT 7Vt«mt?'TV\M»*T'lV-rw\ttrr»r t> -r 

i iMti ivjj^^-vi jjc. j. x o« a i KstUMj i v JjAWU e 1 yPIRNAX YQNRliAL 
DYLLVAEGG VCR KFNLTNCCLQ I NDQGQ WKN I VRDMTKLAH VP 
IQVWHKFDPESLFGKWFPAIGGFKTLIVGVLLVIRTCLLLPCVL 
PLLFQM I KG I VATL VHQKTS AHVNYMNHYRS I SQRDSKS EDE S E 
NSH 


5923 


137 


638 


QLCGRRGQR FRTS I KRMHPI * RTCPfoTNL/ 1 1 LtSQENTQI RW 
QQENRELWISLEEHQDALEIilMSKYRKQMLQLMVAKKAVDAEPV 
LKAHQSHSAE I ESQI DRI CEMGEVMRKAVQ VDDDQ FCKIQEKLA 
QLELENKELRELLS I SS ESLQARKENSMDTASQAI K 


5924 

5925


274 


2146 


EKGKVKDAOAEQWI SLS LSCKGSWETQFSNHLNS LTPPTS VRRM 
PLITTVTLLKMVARHHKKLLCSKAFSTQLQQKIFLHSQMGIHHQ 
SVCMKLKPNTSHI ISILMGQPMALVQLETLAPTjTI I IQKFQTQD 

hmkfwknlplhshhltps vpqtvi pkktgspe i klk i tkti qng 
relfesslcgdh»nkvqase\q*nqsiesrkekrkksnkkdssr 
seerkshkipklepeeqnrpnervdtvsekpreepvlkegspss 

■Hix a j. r t^iwwtas VHW \r KFQvGDLVWSKVGTYPWWPCMVSSDPQL 

evhtkintrgareyhvqffsnqperawvhekrvreykghkqyee 

LLAE ATKQASNHS EKQK I R KPRPQRERAQ WDIG 1 AHAEKALKMT 
REER1EQYTFIYIDKQPEEALSQAKKSVASKTBVKKTRRPR5VL 
NTQP EQTNAGE VAS S LSS TE I RRHSQRRHTS AEE EEP PPVKI AW 
KTAAAR KSLPAS ITMHKGSLDLQKCNMS P WK I EQVFALQNATG 
DGKFIDQFVYSTKGIGNKTEISVRGQDRLIISTPNQRNEKPTQS 

VSSPEATSGSTGS VEKKQQRRS IRTRS ESEKSTEWPKKKI KKE 
QVGFLHVES 




21* 


1911 


MMTAESREATGLS PQAAQEKDGI VI VKVEEEDEEDHMWGQDS TL 
QDTPPPDPEIFRQRFRRFCYQNTFGPREALSRLKELCHQWLRPE 
INTKEQIIjELLVLEQFLSILPKELQVWLQEYRPDSGEEAVTLLE 
DIjELDLSGQQVPGQVHGPEMLARGMVPLDPVQESSSFDLHHEAT 
QSHFKHSSRKPRLLOSRALP212lWTDaDDUE'r , cnDnn»MAe7VT -nm 

ADSQAMVKI 3DMAVS L I LEE WGCQNLARRNLS RDNRQENYGS AP 
PQGGENRNENEESTS KAETQV n qic or" o tty^o cAvrD^ciimnAo 

GKTGERQQKNPEEKTRKEKRJD5GPAIGKDKKTITGERGPREKGK 
GLGRSFSLSSNFTTPEEVPTGTKSHRCDECGKCFTRSSSLIRHK 
I IHTGEKPYECSECGKAF\SI*NS \NLVLHQRI \HTGEKPHECNE 
CGKAFSHSSNLILHQRIHSGEKPYECNECGKAFSQSSD\LTKHQ 
RIHTGEKPYECSECGKAFNRNSYLILHRRVHTREKPYKCTKCGK 
\AFTRSSTLTLHHR I HARERASEYS PASLDAFGAFLKSCV 


5926
5927


2 


233 


DRCLMLKQGSQPGSPPAT/C3PPAPPVYQAPCQSCPEPPGAIIEP ~ 
SDSPHHTPVHPPPEHSAACPAPATCCPPPRSSMS 




4146


1246 



KHFS KFGSQ All YQLKRPASGQNS I S VM PAQKI TKPAAKYG I PliA * 
YKKYGDKKLHEKKPLQKHKQAHQTPEKRVNTGEERRKISEEAAR 
KRRLEFIEKEKKQKDQIISLMKAEQMKRQEKERLERINRAREQG 
WRNVLSAGGSGEVKAPFLGSGGTIAPSSFSSRGQYEHYHAI FDQ 
MQQQRAEDNEAKWKREIYGRGLPERQKGQLAVERAKQVEEFLQR 
KREAMQNKARAEGHMG I LQMLAAMYGGRPS S S RGGKPRNKE EEV 
YIJulLRQIRLQNFNERQQrKAKLRGEKKEANHSEGQEGSEEAJDM 
RRKK\IESLKAHANARAAVLKEOLERKRKEAYEREKKVWEEHLV 
AKGVKSSDVSPPrjGQHETGGSPSKQQMRSVISVTSALKEVGVDS 
SLTDTOETSEEMQKTIWAJCSSKREILRRLNENLKAOKDEKGKQN 
LS DTFEI NVHEDAKEH E KEKS VS SDR KKWEAGGQLVI PI*DELTL 
DTS FSTTERHTVGE V I KLGPNGS PRRAWGKS PTDS VLKI LGEAE 
LQIjQTELLENTTIRSE I S PEGE KY K PL I TGE KK V(jC I SHE I N PS 
AI VDS PVET KS PEFS EAS POM S L KL EGKTLEE o DnT, PTP t t /mtd c 

GTNiCDE\SLPCTITDVWISEEKETKETQSADRITIQENEVSEDG 
VS STVDQLS D IHI EPGTNDS QHSKCDVDKS VQ PEP PFHKWHSE 

HLNLVPQVQSVQCSPEESFAFRSHSHLPPKNKWKNSLLlGIiSTG 
LFDANNPKMLRTCSLPDLSKLFRTLMDVPTVGDVRQDNLEIDEI 
EDEN I KEG PS DS ED I VFE ETDTDLQELQASMEQLLREQPG E E YS 
EEEESVLKNSD7EPTANGTDVADEDDNPSSESALNEEWHSDNSD 
GEIASECECDSVFNHLEELRLKLEQEMGFEKFFEVYEKIKAIHE 
DED ENI E I CS KI VCNT LGNEHQHLYAKI LHLVMADGAYQEDNDE 


5928 


4146 


1248 


KHFSKFGSQAL.YQLKRPASGQNS I SVMPAQKITKPAAKYGI PLA 
YKKYGDKKLHEKKPLQKHXQAHQTPEKRVNTGEERRKISEEAAR 
KRRLEFIEKEKKQKDQIISLMKAEQMKRQEKERLERINRAREQG 
WRNVLSAGGSGEVKAPFLGSGGTIAPSSFSSRGQYEHYHAIFDQ 
MQQQRAEDNEAKWKRE I YGRGLPE RQKGQLAVERAKQVEEFLQR 
KREAMQNKARAEGHMGILQNLAAMYGGRPSSSRGGKPRNKEEEV 
YbARLRQ I R LQNFNERQQ I KAKLRG EXKEANHS EGQEGS EEADM 

RRKK\IESLKAHANARaAVLKEQLERKRKEAYEREKKVWEEH1jV 
AKGVKS SDVS PPLGQHETGGS PSKQQMRSVIS VTSAXjKEVG VDS 

sltdtrstseemqktnnaisskreilrhlnenlkaqedbkgkqn 
lsdtfeinvhedakehekeksvssdrkkweaggqlvipldeltl 
dtsfstterhtvgeviklgpngsprrawgksptdsvlkilgeae 
lqlqtellenttirseispegekykplitgekkvqcisheinps 

v f va i m> Pti FSEAS PQMS T>KLEGNLEEPDDLETE ILQEPS 

gtnkde\slpctitdvwiseefo2tketqsadritiqenevsedg 
vsstvdqlsdihiepgtndsqhskcdvdksvqpepffhkwiise 

HLNLVPQVQS VQCS PEES FAFRSHSHLP P KNKNKNS LL IGLS TG 

lfdannpkmlrtcslpdlsklfrtlmdvptvgdvrqdnleidei 
edenikegpsdsedivfeetdtdlqelqasmeqllreqpgeeys 
eeeesvlknsdveptangtdvadeddnpssesalneewhsdnsd 
geiasececdsvfnhleelrlhleqemgfekffevyskikaihe 
dsdenieicskivqnilgnehqhlyakilhlvmadgayqednde 


5929 


3 


1558 


ldfsmttqlpayvai llfyvs rascqdtftaavyehaailpnat 
lttvsreeai^mnrnldilegaitsaadqgahiivtpedaiyg 

WNFNRDSLYPYLEDI PDPEVNWI PCNNRNRFGQTPVQ2RLSCIA 
* nvitiiw j. j. v *ruN4.«i^iv^t'i - ijiaiJryulrl?lJv5RxQYNTDVVF\DSQG 

klvaryhkqnlfmgenqfnvpkepeivtfnttfgsfgiftcfdi 
lfhdpavtlvkdfhvdtivfptawmnvlphlsavefhsawamgm 

RVNFLASNIHYPSKKMTGSGIYAPNSSRAFHYDMKTEEGKLLLS 

qldshpshsawnwtsyassiealssgnkefkgtvffdeftfvk 

LTGVAGNYTVCQKDLCCHLS YKMSEN I PNE VYALGAFDGLHTVE 
GRYYLQ2CTLLKCKTTNLNTCGDSAETASTRFEMFSI>SGTFGTQ 
YVFPEVLLSENQLAPGEFQVSTDGRLFSIiKPTSGPVLTVTLFGR 

lyekdwasnassgl?aqariimliviapivcslsw 


5930 


113 


6082 


rgncfwivpftmaqrtgledperylfvdraviynpatqadwtak 

KLVWI PS ERHGFEAAS I KEERGDEVMVELAENGKKAMVNKDDI Q 
KMNPPKFSKVEDMAELTCLNEASVLHNLKDRYYSGLIYTYSGLF 

cwinpyknlpiyseniiemyrgkkrhempphiyaisesayrcm 

LQDREDQS ILCTGESGAGKTENTKKVIQYLAHVASSHKGRKDHN 

ipgevlerqllqanpilesfgnartvqndnssrfgkfirinfdv 
tgyivganietylleksravrqakdertfhifyqllsg\agehl 
ksdlllegfnnyrflsngyipipgq\qdkgnfrgdpgear^himg 
fsheeilsmlkwssvlqfgnisfkkerntdoasmpentvaqkl 
chllgmnvmeftrailtprikvgrdyvqkaqtkeqadfaveala 
katyerlfrmlvhrinkaldrtkrqgasfigildiagfeifeln 
sfeqlcinytneklqqlfnhtmfileqeeyqregiewnfidfgl 
dlqpcidlierpanppgvlalldeecwfpkatdktfveklvqeq 
sshskfqkprolkdkadfciihyagkvdykadewlmknmdplnd 
wvatllhqssdrfvaelwkdvdrivgldqvtgmtetafgsaykt 

KKGMFRTVGQLYKESLTKLMATLRNTNPNFVRCI IPNHEKRAGK 
LDPHLVLDQLRCNGVLEGIRICRQGFPNRIVFQEFRQRYEILTP 






WO 01/53312 



PCT/US00/34263 





5931 






WAI PKGFMDG KQAC BR M I RAL EliD P NtLYR IGQSKI P F RAG VLAH 

LEKKRDLKITDIIIFPQAVCRGYLARKAFAKKQQQLSALKVLQR 

NCAAYLKLRHWQWWRVFTKVKPLLQVTRQEEELQAKDEELLKVK 

HKQTKVEGELEEMERKHQQLLEE KNILAEQIjQAETB L FAEAE EM 

RARLAAKKQELEE ILHDLSSRVEEEEERNQILQNEKKKMQAHIQ 

DLEEQLDEEEGARQKIiQLEKVTAEAKIKKMEEEILLLEDQNSKF 

I KE KKLMEDR I AECSS Q LAEE E E KAKNLAKI RNKQEVM I S DL EE 

RLKKEE KTR QE LE KAKR KLDGETTDLQDQ IAELQAQ I DE LKLQL 

AKKEEELQGAliARGDDETLHKNNALICWRELQAQIAELQEDFES 

EKASRNKAEKQKRDLSEELiEALKTELEDTLDTTAAQQELRTKRE 

QEVAELKKALEEETKNHEAQIQDMRQRHATAliEELSEQLEQAKR 

FKANLEKNKQGLETDNKELACEVKVLQQVKAESEHKRKKLDAQV 

QELHAKVS EGDRLRVELAE KAS KLQNELDNVSTL LEE AE KKG I K 

FAKDAASLESQLQDTQELLQEETRQKIiNLSSRIRQLEEEKNSLQ 

EQQEEEEEARKNLEKQVLALQSQLADTKKKVDDDLGTIESLEEA ' 

KKKLLKDAEALSQRLEBKALAYDKLEXTKNRLQQELDDLTVDLD 

HQRQVASNLEKKQ\KKFDQLLAEEKSISARYAEERDRAEAEARE 

KETKALSLARALEEALEAKEEFERQNKQLRADMEDLMSSKDDVG 

KNVHELEKSKRALEQQV\EEMRTQLEELEDELQATEDAKLRLEV 

NMQAMKAQFERDLQTRDEQNEEKKRLLI KQVRELEAELEDERKQ 

RALAVASKKKMEIDLKDLEAQIEAANKARDEVIKQLRKLQAQMK 

DYQRELEEARASRDEIFAQSKESEKKLKSLEAEILQLQEELASS 

ERARRHAEQERDELADEITNSASGKSALLDEKRRLEARIAQLEE 

ELEE EQSNMELLNDRFRKTTLQVDTLNAELAAERS AAQKSDNAR 

QQLERQNKELKAKLQELEGAVKSKFKATISALEAKIGQLEEQLE 

QEAKERAAANKLVRRTE KKLKEI FMQ VEDERRHADQYKEQM E KA 

NARMKQLKRQLBBAEEEATRANASRRKLQRELDDATEANEGLSR 

EVSTLKNRLRRGG P I S FS S SRS GRRQLHLEGASLELSDDDTE S K 

TSDVNBTQPPQSE 




113


6082 



RGNCF W I VP FTMAQRTGLEDPER YL F VDRAVl YNPATQADWTAK 
KLVWIPSERHGFEAASIKEERGDE^WELAENGKKAMVNKDDIQ 
KMNPPKFSKVEDMAELTCLNEASVLKNLKDRYYSGLIYTYSGLF 
CWINPYKNLPIYSENIIEMYRGKKRHEMPPHIYAISE3AYRCM 
LQDREDQS ILCTGESGAGKTENTKKVIQYLAHVASSHXGRKDHN 
IPGE\LERQLLQANP I LESFGNARTVQNDNSSRFGKFIRINFDV 
TGYIVGANIETYLLEKSRAVRQAKDERTFHIFYQLLSG\AGEHL 

ksdlllegfnnyrflsngyipipgqXqdkgnfrgdpgeamhimg 

FSHEEILSMLKWSSVLQFGNISFKKERNTDQASMPENTVAQKL 

chllgmnvmeftrailtprikvgrdyvqkaqtkeqadfaveala 
katyerlfrwlvhrinkaldrtkrqgasfigildiagfeifeln 
sfeqlcinytneklqqlfnhtmfileqeeyqregiewnfidfgl 
dlqpcidlierpanppgvlalldeecwfpkatdktfveklvqeq 
gshskpqkprqlkdkadfci ihyagkvdykadewlmknmdplnd 

NVATLLHQS SDRFVAELW KDVDR I VGLDQ VTGMTETAFGSA YKT 
KKGM7RTVGQLYKESLTKLMATLRNTNPNFVRCI IPNHEKRAGK 
LDPHLVIiDQLRCNGVLEGIRICRQGFPNRIVFQEFRQRyErLTP 
NAIPKGFMDGKQACERMIRALELDPNLYRIGQSKIFFRAGVLAH 
LEEERDLKITDI II FFQAVCRGYLARKAFAKKQQQLSALKVLOR 
NCAAYLKLRHWQWWRVFTKVKPLLQVTRQEEELQAKDEELLKVK 
EKQTKVEGELEEMERKIIQQLLEEKNILAEQLQAETELFAEAEEM 
RARLAAKKQELEE I LHDLESRVBEEEERNOILQNEKKKMQAH I Q 
DLEEQLDEEEGARQKiQLEKVTAEAKIKKMEEEILLLEDQNSKF 
IKBKXLMEDRIAECSSQLAEEEEKAKNLAKIRNKQEVMISDLEE 
RLKKEEXTRQELEKAKRKLDGETTDLnnr)T2iPr nanTriwT irr r\r 

ftKKEEEI^ALARGDDBTLHKNNALKVVRELQAQIAELQEDFES 
EKASRNKAEKQKRDLSEELEALKTELEDTLDTTAAQQELRTKRE 
2EVAELKKALEEETKNHEAQIQDMRQRHATALEELSEQLEQAKR 
FKANLEraTKQGLETDNKELACEVKVLQQVKAESEHKRKKLDAQV 
2ELHAKVSEGDRLRVELAEKASKLQNELDNVSTLLEEAEKKGIK 
7 AKDAASLESQLQDTQELLQEETRQKLNLSSRIRQLEEEKNSLQ 
SQQREEEEARKNLEKQVIALQSQLADTKiaCVDDDLGTlESLEEA 
KKKLLKDAEALSQRLEEKALAYDKLEKTKNRLQQELDDLTVDLD 
HQRQVASNLEKKQ\KKFDQLLAEEKSISARYAEERDRAEAEARE 
KETKALSLARALEBALEAKEEFERQNKQLRADMEDLMSSKDDVG 
KNVHELEKSKRALEQQV\EEMRTQLEELEDELQATBDAKLRLEV 
NMQAMKAQFERDLQTRDEQNEEKKRLLIKQVRELRAELEDERKQ 
RALAVAS KKKMEIDLKDLEAQI EAANKARDE VIKQLRKLQAQMK 
DYQREI*EEARASRDEIFAQSKESEKKLKSLEAEILQLQE3LASS 
ERARRHAEQEROELADE I TNSASGKS ALLDE KRRLEAR I AQLEE 
ELEEEQSNMELU^DRFRKTTLQVDTLNAELAAERSAAQKSDNAR 
QQLERQNKE LKAKLQEL EGAVKS KFKAT I S ALEAKIGQIiEEQLE 
QEAKERAAANKLiVRRTEKKLKEIFMQVEDERRHADQYKEQMEKA 
KARMKQLKRQLEEAEEEATRANASRRKLQRELDDATBANSGLSR 
EVSTLKNRLRRGGPISFSSSRSGRRQLHLEGASLELSDDDTESK 
TSDVNETQPPQSE 


5932 


33 




RHLEEICFLFLQKGRKLKLSGPRWEEGKPRGTGGLWVKAEANMG 
FGATLAVGLTI?VLSWTIIICFTCSCCCLYKTCRRPRPV\APP 
PHPP/PWHAPYPQPPSVPPSYPGPSYQGYHTMPPQPGMPAAPY 
PMQY P P P Y PAQ PMGP PAYHETLAGGAAA P Y PAS QPP YNPAYMDA 
PKAAL 


5933 


1 


3190 


GTRKLKMADKTPGGSQKASSKTRSSDVHSSGSSDAHMDASGPSD
SDMPSRTRPKSPRKHNYRNESARESLCDSPHQNLSRPLLENKLK
AFSIGKMSTAKRTLSKKEQEELKKKEDEKAAAEIYEEFLAAFEG
SDGNKVKTFVRGGWNAAKEEKETDEKRGK1YKPSSRFADQKNP 
PNQSSNERPPSLLVIETKKPPLKKGEKEKKKSNLELFKEELKQI 
QEERDERHKTKGRLSRFEPPQSDSDGQRRSMDAPSRRNRSSGVL 
DDYAPGSHDVGDPSTT\NFYLGNI\NPQMNLKKCCCGEFGRFGP 
LASVKIMWPRTDEERARERNCGFVAFMNRRDAERAliKNIiNGKMI 
MSFEMIOjGWGKAVPIPPHPIYIPPSMMEHTLPPPPSGLPFNAQP 
RERLKNPNAPMLPPPKNKEDFEKTLSQAIVKWIPTERNLLALI 
HRMIEFWREGPMFEAMIMNREINNPMFRFLFENQTPAHVYYRW 
KLYS I LQGDS PTKWRTBD FRMFKNGS FWRPPPLNP YLHGMSE EQ 
ETEAFVEEPS KKGALKBEQRDKLBE ILRGLTPRKNDIGDAMVFC 
LNNAEAAEEIVDCITESLSILKTPLPKKIARLYLVSDVLYNSSA 
KVANAS YYRKFFETKLCQ I FSDLNAT YRTTQGHIiQSENFKQRVM 
TCFRAWE D WAI YPE PFL I KLQN I FLGL VNI I BEKETED VPDDLD 
GAPIEEELDGAPLEDVDGIPIDATPIDDLDGVPIKSLDDDLDGV 
PLDATEDSKKNEPIFKVAPSKWEAVDESELEAQAVTTSKWELFD
QHEESEEEENQNQEEESEDEEDTQSSKSEEHKLYSNPIKEEMTE 
SKFS KYSEMSEEKRAKLRE I ELKVMKFQDELESGKRPKKPGQSF 
QEQVEHYRDKLLQREKEKELERERERDKKDKEKLESRSKDKKEK 
DECTPTRKERKRRHSTSPSPSRSSSGRRVKSPSPKSERSBRSER 
SHKESSRSRSSHKDSPRDVSKKAKRSPSGSRTPKRSRRSRSRSP 
KKSGKKSRSQSRSPHRSHKKSKGKTNTGRECFFKKAVTYWKCDLF 
LCPERSVF 


5934 


1 


3190 


GTRKLKMADKTPGGSQKASSKTRSSDVHSSGSSDAHMDASGPSD

SDMPSRTRPKSPRKHNYRNESARESLCDSPHQNLSRPLLENKLK

AFSIGKMSTAKRTLSKKEQEELKKKEDEKAAAEIYEEFLAAFEG

SDGNKVKTFVRGGWNAAKEEHETDEKRGKIYKPSSRFADQKNP 

PNQSSNERPPSLLVIETKKPPLKKGEKEKKKSNLELFKEELKQI 

Q SERDE RH KTKGRL S R FEP PQ S DSDGQRR SMDAPSRRNRSSG VL 

DDYAPGSHDVGDPSTT\NFYLGNI\NPQMNLKKCCCQEFGRFGP 

LAS VKIMW PRTD E ERARERNCGFVA FMWRRDAE RAL KNLNGKM I 

MSFEMKLGWGKAVPTDDHDTYTDDCMMPtffr DDDDCrr crryj-n r\-n 
t u p aiinwn VNi v r i r r nr ± I iffbnflfirli Ijirrf l'otJijt j rNAQP 

RERLKNPNAP^PPPKNKEDFEKTLSQAIVKWIPTERNLLALI 

HRMIEFWREGPMFEAMIMNREINNPMFRFLFENQTPAHVYYRW 

KLYS ILQGDS PTKWRTEDFRMFKNGSF WRP P PLNPYLHGMSE EQ 

ETEAFVEE PS KKGALKEEQRDKLEE ILRGLT PR KNO IGDAMVFC 

LNNAEAAEEIVDCITESLSILKTPLPKKIARLYLVSDVLYNSSA 

KVANAS YYRKFFETKLCQIFSDLNATYRTIQGHLQSENFKQRVM 

TCFRAWBDWAIYPBPFLIKLQNIFLGLVNIIEEKETED VPDDLD 

GAPIEEELDGAPLEDVDGIPIDATPIDDLDGVPIKSLDDDLDGV
PLDATEDSKKNEPIFKVAPSKWEAVDESELEAQAVTTSKWELFD
QH2 ES E E E ENQNQEE E S EDEEDTQS S KS E EHHLYSNP I KEEMTE 
SKFSKYSEMSEEKRAKLREIELKVMKFQDELESGKRPKKPGCSF 
QEQVEHYRDKLLQREKEKELERERERDKKDKEKLESRSKDKKEK 
D E CT FT R KE RKRRKSTS PS PS RSSSGRRVKS PS PKSERSERSER 
SHKESSRSRSSHKDSPRDVSKKAKRSPSGSRTPKRSRRSRSRSP 
KKSGKKSRSQSRSPHRSHKKSKGKTNTGRKFFKKAVTYWKCDLF 
LCPERSVF 


5935 


3 


4493 


S Y W LS G WR LS R P PRQFWAGWRG I GR FGTMAP VHGDD CE I G AS AL 
SDSGSFVSSRARREKKSKKGRQEALERLKKAKAGERYKYEVEDF 
TGVYEEVDEEQ YSKI*VQARQDDDWI VDDDGIGYVEDGRE I FODD 
LEDDALDADEKGKDGKARNKDKRNVXKLAVTKPNNIKSMFTACA 
GKKTADKAVDLSKDGLLGDIIiQDLNTETPQITPPPVMILKKKRS 
IGAS PNP FS VHTATAVPS GK I AS PVS RKE PPLTPVPLKRAEFAG 
DDVQVESTEEEQESGAMEFEDGDFDEPMEVEEVDLEPMAAKAWD 
KESEPAEEVKQEADSGKGTVSYLGSFLPDVSCWDIDQBGDSSFS 
VQEVQVDSSHLPLVKGADEEQVFHFYWLDAYEDQYNQPGWFLF 
GKVWIESAETHVSCCVMVKNIERTLYFIjPREMKIDLNTGKETGT 
FISMKDVYEEFDEKIATKYKIMKFKSKPVEKNYAFEIPDVPEKS 
E YLE VKYSAEM PQLPQD LKGETFSHVFGTNTS S L>ELFLMNRKI K 
GPCWLE VKKS TALNQP VS WCKVEAMAL KPDLVNVI KDVS P P P LV 
VMAFSMKTMQNAKNHQNE 1 1 AMAALVHHS FALDKAAPKPPFQSH 
FCWSKPKDCIFPYAFKEVIEKKNVKVEVAATERTLLGFFLAKV 
HKIDPDIIVGHNIYGFELEVLLQRINVCKAPHWSKIGRLKRSNM 
PKLGGRSGFGERNATCGRMICDVEISAKELIRCKSYHLSELVQQ 
I LKTE R W I PMEN I QNM YS ES SQLL YLLEHTW KD A\KF I IiQ I MC 
ELNVLPLALQ I TN I AGN I MS RTLMGGRSERNB FLLLHAFYENN Y 
IVPDKQIFRKPQQKLGDEDEEIDGDTNKYKKGRKKGAYAGGLVL 
DPKVGFYDKFILLLDFNSLYPSIIQEFNICFTTVQRVASEAQKV 
TEDGEQ3QIPELPDPSLEMGILPREIRKLVERRKQVKQLMKQQD 
LWPDLILQYDIRQKALKLTANSMYGCLGFSYSRFYAKPIiAALVT 
YKGRE I I»MHTKEMVQKMNLE VI YGDTDS IMI NTNS TNLEB VFKL 
GNKVKSEVNKLYKLLEIDIDGVFKSLLIiLKKKKYAALWEPTSD 
GNYVTKQELKGLD I VRRDWCDLAKDTGNF VI GQ I LS DQSRDT I V 
ENIQKRLIEIGENVLNGSVPVSQFEINKALTKDPQDYPDKKSLP 
HVHVALW INS QGGRKVKAGDT VS YVI CQDGSNLTASQRAYAPEQ 
IiQKQDNLTIDTQYYLAQQ IHPWAR I CEPIDGI DAVLIATGWEL 
\DPTQFKVHHYHKDEENDALLGGPAQLTDEEKYRDCERFKCPCP 
TCGTENI YDNVFDGSGTDME PSLYRCSNIDCKAS PLTFTVQLSN 
KLIMDIRRFIKKYYDGWLICEEPTCRNRTRHLPLQFSRTGPLCP 
ACMKATLQPEYSDKSLYTQLCFYRYIFDAECALEKLTTDHEKDK 
LKKQFFTPKVLQDYRKLKNTAEQFLSRSGYSEVNLS KJjFAGCAV 
KS 


5936 


1124 


139 


RGEEQFDAEFRRFACLGFGERLQEFSRLLRAVHRSRAWTCYLAI 
RMLMATCCPSPTTTACTGPWQRAPPLRLLVQKREADSSGLAFAS 
NSIiQRRKKGLLLRPVAPLRTRPPLLISLPQDFRQVSSVIDVDLL 
PETHRRVRLHKHGSDRPLGFYIRDGMS VR VAPQG \ LERVPGI FI 
SRLVRGGLAESTGLLAVSDEI LE VNGI EVAGKTLNQVTDMMVAN 
SHN\LIVTVKPANQRNNWRGASGRLTGPPSAGPGPAEPDSDDD 
SSDLVIENRQPPSSNGLSQGPPCWDLHPGCRHPGTRSSLPSIjDD 
QEQASSGWGSRIRGDGSGFSL 


5937


31 


1600 


PTS LLKSTVQLMCRL LQ DKRYQC VYS LAE I FKVLAS FY V I LV I L "~ 

VfiT.TCC VCT.WWMT BOOT Vf^vo W1\ t n pvphtvpht nrvirv\Trtmi r*-r 

ioujl &o * ouri wnjjKooLiisjy jo PB«ALiKaiwWxoUXPDVKjlUr AFI 
LHLADQYD P LYS KRFS I FLSE VS ENKLKQ I NLNNE WT VEKLKS K 
LVKNAQDKI ECiHLFMLNGLPDNVFELTEMEVLSLEIiI PE VKLPS 
AVSQLVNLKELRVYHSSLWDHPALAFLEENLKILRLKFTEMGK 
I PRWVFHL KNLKEL YLS GCVLPEQLSTMQLEG FQDLKNLRTLYL 
KS S LSR I PQ WTDLLP S LQKLS LDNE&S KLVVLNNLKKM VNLKS 
LELI SCDLERI PHS I FS LNNLHELDLRBNNLKTVEE IIS FQHLQ 
NLSCLKLWHNNIAYIPAQIGALSNLEQLSLDHKNIENLPLQLFL 
CTKLHYLDLS YNHLTF I PEE I Q YL\SNLQYPAVTNNN I EMLPDG 
LFQCKKLQCLLLGKNSLMNLSPHVGELSNLTHREPIG\NYLETL
PPELEGCQSLKRNCLIVEENLLNTLPLPVTERLQTCLDKC 


5938


395 


18^5 


YKGEGFFCNQEARGERRKKKKAMSSPN I WSTCSSVYSTPVFSQK ' 
MTVWILLLLSLYPGFTSQKSDDDYEDyASNKTWVLTPKVPEGDV 
TVIIiNNLLEGYDNKLRPDIGVKPTLIHTDMYVNSIGPVNAINME 
YTIDI FFAQTWYDRRLKFNST IKVTjRLNSNMVGKT W T PnTwc-orvr 
SKKADAHWI TTPNRMLR I WNDGRVLYSLRLTI DAECQIiQLHNFP 
MDEHS CPLEFS S YG YPR EEI VYQ WKRSS VEVGDTRS WRL YQFS F 
VGLRNTTE WKTTSGDYWMSVYPDLSRRWGYFTIQTYI PCTLI 
WLSWVSPWINKDAVPARTSLG1TTVLTMTTLSTIARKSLPKVS 
YVTAMDLFVSVCFIFVFSALVEYG\TLHYPVSNRKPSICDKDKKK 
KNPAPTIDIRPRSATIQMNNATHLQERDEBYGYECLDGKDCASF 
FCCFE DCRTGAWRHGRIH I R I AKMD S YAR I FFPTAFCLFNLVYW 
VSYLYL 


5939 


66 


1404 


IRPGYLKEVQENS PGHRAGLEP F PDFI VS INGSRLNKDNDTLKD 
LLKANVEKPVXMLIYSSKTLELRETSVTPSNLWGGQGLLGVSIR 

FCSFDGANRNVWHVT.PVPCMCDIMVT unr DDUcnvTrnKnimnim 

SE0L FS L I E THEAKPL KL YV YNTDTDNCRE V 1 1 TPNSAWGGEGS 
LGCGIGYGYLIIRIPTRPFEEGKKISLPGQMAGTPITPLKDGFTE 
VQLSSVNPPSLSPPGTTGIEQSLTGIiSISSTP\PAVSSVLSTGV 
PTVP \ LLP PQVNQSLTS VPPMESS YLHLPGLMP FTRQGLPNLPQ 
PSTFNLPR\ PTHfiWPnvrtT.vnT7T?WDr , X7T ddt ccmddd«it nrtv -r 

APLPIiPSEFLPSFPLVPESSSAASSGELLSSLPPTSNAPSDPAT 
TTAKADAAS S LT VDVTP PTAKAPTT VEDRVGDS TPVSEKPVSAA 
VD AN AS ESP 


5940 


145 


717 


RRSASRSAS PRQSAGTAVTTGTRAGGTCIAAAHHRMRWRADGRS 
LEKLP VHMGLV I TE VEQE PS FSD1 AS LWWCMAVG I S Y I S VYDH 
QGIFKRNNSRI^DEILKQQQELLGLDCSKYSPEFANSNDKDDQV 
LNCHLAVKVLS PEDGKAD I VRAAQD FCQLVAQ KQKRPTDLD VDT 
LA\ VYLVQMWL I L 1 


5941 


13 


6147 


MCLGRMGASS PRS PE PVG PPAPGLP FC CGGSLLA VWLLAL P VA 

WGQCNA?EW\LPFARPTNLTDEFEFPIGTYLNYECRPGYSGRPF 

SIICLKNSVWTGAKDRCRRKSCRNPPDPVNGKVHVIKaiQFGSQ 

IKYSCTKGYRLIGSSSATCIISGDTVIWDNETPICDRIPCGLPP 

TITNGDFISTNRENFHYGSWTYRCNPGSGGRKVFELVGEPSIY 

CTS NDDQ VGI WSGP APQC 1 1 PNKCTP PKVENG I L VS DNRSLFS L 

NBWEFRCQPGFVMKGPRRVKCQALNKWEPELPSCSRVCQPPPD 

VLHAERTQRDKDNFSPGQBVFYSCEPGYDLRGAASMRCTPQGDW 

S PAA PTCE VKS CDDFMGQLLNGR VLF P VNLQLGAKVDFVCDEG F 

QLKGSSASYCVLAGMESLWNSSVPVCEQIFCPSPPVIPNGRHTG 

KPLEVFPFGKAVNYTCDPHPDRGTSFDLIGESTIRCTSDPQGNG 

VWSSPAPRCGILGHCQAPDHFLFAKLKTOTNASDFPIGTSLKYE 

CRPEYYGRPFSITCLDNLVWSSPKDVCKRKSCKTPPDPVNGMVH 

VITDIQVGSRINYSCTTGHRLIGHSSAECILSGNAAHWSTKPPI 

CQRIPCGLPPTIANGDFISTNRENFHYGSWTYRCNPGSGGRKV 

FELVGEPS I YCTSNDDQVGI WSGPAPQCI IPNKCTPPNVENGIL 

VSDNRSLFSLNEWEFRCQPGFVMKGPRRVKCQALNKWEPELPS 

CSRVCQP P PDVLHAERTQRDKDNFSPGQE VFYSCE PC YDLRGAA 

SMRCTPQGDWS PAAP TCEVKS CDDFMG QL LNGRVL FP VNLQLGA 

KVDFVCDEG FQL KGS S AS YCVLAGMESLWNSS VP VCEQ I FCPSP 

PVIPNGRHTGKPLEVFPFGKAVNYTCDPHPDRGTSFDLtGESTI 

RCTSDPQGNGVWSS PAPRCG I LGHCQAPDHFLFAKLKTQTNASD 

FPIGTSLKYECRPB Y YGRPFS ITCLDNLVWSS PKDVCKRKSCKT 

PPDPVNGMVHVITDIQVGSRINYSCTTGHRLIGHSSAECILSGN 

TAHWSTKPPICQRI PCGLPPTI ANGDFISTNR3NFHYGSWTYR 

CNLGSRGRKVFELVGEPSIYCTSNDDQVGIWSGPAPQCIIPNKC 

TPPNVENG IL VSDNRS LFSLNE WE PRCQPG FVMKG PRRVKCQA 

LNKWEPBLPSCSRVCQPPPEILHGEHTPSHQDNFSPGOEV?YSC 

E PGYDLRGAASLHCTPQGDWS PEAPRCAVKS CDDFLGQL PHGR V 

LFPLNLQLGAKVSFVCDEGFRLKGSSVSHCVLVGMRSLWNNSVP 

VCEHIFCPNPPAILNGRHTGTPSGDIPYGKEISYTCDPHPDRGM 
TFNLIGBSTIRCTSDPHGNGVWSSPAPRCElLiVRAGHCKTP'EQP 
P FAS PTI PI NDFEFPVGTSLNYECRPGYFGKMFS ISCLENLVWS 
SVEDNCRRKSCGPPPEPFNGMVHINTDTQFGSTVNYSCNEGFRL 
IGSPSTTCLVSGNNVTWDKKAPICEIISCEPPPTISNGDFYSKN 
RTS FHNGTVVTYQCHTGPDGEQLFELVGERS IYCTS KDDQVGVW 
SSPPPRCISTNKCTAPEVENAIRVPGNRSFFSLTEIIRFRCQPG 
FVMVGS HTVQCQTNGRWG P KLPH CSRVCQ PP PE I LHG EHTL SHQ 

*^ * »r«w«vt ns\.Eiira i uuKVjrt/io L»rlw Jl irU&lJWo PEiAPRCTVKS 

CDDFLGOLPHGRVLLPLNLQLGAKVSFVCDEGFRLKGRSASHCV 
LAG MKALWNS S VP VCEQI FCPN PPAI LNGRHTGT PLGD I P YGKE 
VSYTCDPHPDRGMTFNLIGESTIRRTSEPHGNGVWSSPAPRCEL 
PVGAACPHPPKIQNGHYIGGHVSLYLPGMTISYTCDPGYLLVGK 
GF I FCTDQG I WSQLDH YCKEVNCS FPLFMNG I S KELEMKKVYHY 
GDYVTLKCEDGYTLEGSPWSQCQAnDRWDPPLAKCTSRTHDALI 
VGTLSGTIFFI LLI I FLSWI I LKHRKGNNAHENPKEVAI HLHSQ 
GGSSVHPRTLQTNEENSRVLP 




5942 
5943


4509 


588 


YLY\T&MRANPliAYGISHKAYQIDPPL\RKHREQ\LVIE\VGRKL 

DK\AQMIRFEERTGYFSSTDLGRTASHYYIKYNTIETFNELFDA 

HKTEGDI FAI VS KAEEFDQ I KVREEE I EELDTLLSNFCBLS TPG 

GVENS YGKIN I LLQTY I NRGEMDSFS L I S DS AYVAQNAAR I VRA 

LFE IALRKRWPTMT YRLLNLS KAIDKRLWG WAS PLRQFS I LP PH 

MLTRLEEKKLTVDKLKDMRKDEIGH1 LHH VNI GLKVKQCVHQI P 

S VMMEAFIQPITRTVLRVTLS I YADFTWNDQVHGTVGEPWWI WV 

EDPTNDHIYHSEYFLALKKQVISKEAQLLVFTIPIFEPLPSQYY 

IRAVSDRWLGAEAVCI INFQHLILPE RH P PHTELLDLQP LP ITA 

LGCKAYEALYNFSHFNPVQTQI FHTL YHTDCNVLLGAPTGSGXT 

VAAELAI FRVFNKYPTSKAVYIAPLKALVRERMDDWKVRIEEKL 

GKKVIELTGDVTPDMKSIAKADLIVTTPEKWCGVSRSWQNRNYV 

QQVTILI IDEIHLLGEERGPVLEVIVSRTNFISSHTEKPVRI VG 

LSTALANARDLADWLNIKQMGLFNFRPSVRPVPLEVHIQGFPGQ 

HYCPRMASMNKPAFQAIRSHS PAKPVLI FVSSRRQTRLTALELI 

A?LATEEDPKQWLNMDEREMENIIATVRDSNLKLTLAFGIGMHH 

AGLHERDRKTVEELFVNCKVQV jl ATSTLAWGVNFPAHLVI I KG 

TEYYDGKTRRYVDFPirDVLQMMGRAGRPQFDDQGKAVILVHDI 

KKDFYKKFLYEPFPVESSLLGVLSDHLNAEXAGGTITSKQDALD 

YITWTYFFRRLIMNPSYYNLGDVSHDSVNKFLSHLIEKSLIELE 

LSYCIEIGEDNRSIEPLTYGRIASYYYLKHQTVKMFKDRLKPEC 
S TE ELLS T IjSDAF P YTT> r .mrDuxrc nu mmc tp t m//it nmnunitrin 
w *.uuijuiJi,jjaurtJio i li^i^VKHWiiUnlWSBjjftKCLPIESNPHSF 

DSPHTKAHLLLQAHLSRAMLPCPDYDTDTKTVLDQALRVCQAML 
DVAANQGWLVTVLNITNLIQMVIQGRWLKDSSLLTLPNIENHHL 
HLFKKWKPIMKGPHARGRTSTECLPELIHACGGKDHVFSSMVES 
ELHAAKTKQAWNFLSHLPEINVGrsVKGSWDDLVEGHNELSVST 
LTADKRDDNKWIKLHADQEYVLQVSLQRVHFGFHKGKPESCAVT 
PR FPKS KDEGWFL I LGE VDKR ELIALKR VG Y I RNHHVAS LS F YT 
P E I PGRY I YTLYFMS DCYLGLDQQ YD/NLSQR YTSES FCTGQHQ 
GL 






1 


2274 



DKPTRHKTYLSSSWAKPIAAAEGPVGDGELWQTWLPNliVVFLRLR' 

EGLKNQS PTE AEKPAS S S LP S S P P PQLLTRNWFGLGG ELFLWD 

GEDSSFLWRLRGPSGGG\EEPALSQYQRLLCINPPLFEIYQVL 

LSPTQHHVALIGIKGLMVLELPKRWGKNSEFEGGKSTVNCSTTP 

VAERFFTSSTSLTLKHAAWYPSEILDPHWLLTSDNVIRIYSLR 

EPQTPTNVIILSEAEEESLVLNKGRAYXASLGETAVAFDFGPLA 

AVPKTLFGQNGKDEWAYPLYI LYENGETFLTY ISLLHS PGN / 1 

WKAVGS IAHAS \AAEDNYG YDACAVLCLPCVPNILVIATESGML 

YHCWLEGEEEDDHTSEKSWDSRIDLIPSLYVPECVBLELALKL 

AS GE DDPFDS DF5CP VKLHRDP KCP S RYHCTHEAG VHS VGLTW I 

HKLHKFLGSDEEDKDSLQELSTBQKCFVEHILCTKPLPCRQPAP 

IRGFWIVPDILGPTMICITSTYBCLIWPLLSTVHPASPPLLCTR 

ED VE VAES PLRVLAETP DS FEKH IRS I LQRS VANP AFL KAS EKD 

IAPPPEECLQLLSRATQVFREQYILKQDLAKEEIQRRVKLLCDQ 

cQ^KQLEDLSYCREERKSLREMABRIiADKYEEAKEKQEDIMNRI^K 






WO 01/53312 



PCT/US00/34263 



KLLHSPHSELPVLSDSKRDMKKELQLIPDQLRHLGNAIKQVTMK 
KDYQQQKMEKVLSLPKPT I I LS AYQRKCIQS I LXEEGEH I REM V 
KQINDIRNHVNF 


5944 


167 


3428 


FS I ATFTDEPBVLTEPPS ATTTTTIG I SATWTTLAGSHGKRNNT 
ITTTSSKRKNRKNKITPENVQIIFDDPLPISYSQPEKVNGESKS 
SSTSESGDSDNMRISSCSDESSNSNSSRKSDNHSPAWTTTVSS 
KKQPSVLVTFPKEERKSVSGKASIKLSETISEGTSNSLSTCTKS 
GPSPLSSPNGKLTVASPKRGQKRBEGWKEWRRSKKVSVPSTVI 
SR V IGRGGCN I NA I R E FTG AH I DI DKQ KDKTGDR I I TI RGGTE 5 
TRQATQLINALIKDPDKEIDELIPKNRLKSSSANSKIGSSAPTT 
TAANTSLMG I KM TTVAIiS STS QTATALT VP AI S S AS THKTI KN P 
VN\NVRPGFPVSFP\LAYPPPQFAHALLAAQTFQQIRPPRLPMT 
HFGGTFPPAQSTWGPFPVRPLSPARATNSPKPHMVPRHSNQNSS 
GSQVNSAGSLTSSPTTTTSSSASTVPGTSTNGS PSS PSVRRQLF 
VTWKTSNATTTTVTTTASNNNTAPTNATYPMPTAKEHYPVSS? 
SSPSPPAQPGGVSRNSPLDCGSASPNKVASSSEQEAGSPPWET 
TNTRPPNSSSSSGSSSAHSNQQQPPGSVSQEPRPPLQQSQVPPP 
EVRMTVPPLATSSAPVAVPSTAPVTYPMPQTPMGCPQPTPKMET 
PAIRPPPHGTTAPHKNSASVQNSSVAVLSVNHIKRPHSVPSSVQ 
LPSTLS TQS ACQNS VHPANKP I APNFSAPL PFG P FS TLFENS PT 
SAHAFVK3GSVVS SQSTPESMLSGKSS YLPNSDPLHQSDTSKAPG 
FR PPLQRPAPSPSG I VNMDS P YGS VT PS S THLGNFASNI SGGQM 
YGPGAPLGGAPAAANFNRQHFSPLSLLTPCSSASNDSSAGSVSS 
GVRAPSPAPSSVPLGSEKPSNVSQDRKVPVPIGTERSARIRQTG 
TSAPSVIGSNLSTSVGHSGIWSFEGIGGNQDKVDWCNPGMGNPM 
IHRPMSDPGVFSQHQAMERDSTGIVTPSGTFHQHVPAGYMDFPK 
VGGMPFSVYGNAMIPPVAPIPDGAGGPIFNGPHAADPSWNSL2K 
MVSSSTENNGPQTVWTGPWAPHMNSVHMNQLG 


5945


1461 


197 


G VTHL FL FGKRKLRNG IAEDL KGQADFF FLL VS E A WATGS PRA 
WLTCL I LPI»PG 1 1 FS VLPKAMSR P LLI T FTPATDPS DLWKDGQQ 
QPQPEKPESTLDGAAARAFYEALIGDESSAPDSQRSQTEPARER 
KRKKRRIMKAPAAEAVAEGASGRHGQGRSLEAEDKMTHRILRAA 
QEGDLPELRRLLEPHEAGGAGGNINARDAFWWTPLMCAARAGQG 
AAVS YLLGRGAA WG VCELSGRDAAQLAEEAG FPE VARMVRE SH 
GETRSPENRSPTPSLQYCENCDTHFQDSNHRTSTAKLLSLSQGP 
QPPNLPLGVP I SSPGFKLLLRGGWEPGMGLGPRGEGRANPI PTV 
LKRDQEGLG YRSAPQPRVTH FPAWDTRAVAGRE \TP PRVATLS W 
REERRREE \KDRAWERDLRTYMNLEF 


5946 


541 


1661


ILGSYSS IQPEEYS \SWC\EWI^QDLIiA\YVSPK\hSYLRDLP 
SEGS PQRVNS IDFV\ BL\ EHLQPDVLVHAVLR WDF /TI LTEAV 
YSYRGQKQKKVMLTVEQAQDQHYALVLWGPGAAW\YPQLQRKKG 
Y I WE FKYL FVQCNYTLENLELHTT P WS S CE CL FDDD I RA I T FKA 
KFQKS APS FVKI SDIATHLEDKCSGWL I KAQ I S ELAF P I TASQ 
KIALNAHSSltKS I FSS LPN I VYTGCAKCGLEIiETDENRI YKQCF 
SCLPFTMKKIYYRPALMTAIDGRHDVCIRVESKLIEKILLNISA 
DCLNRVI VPS S E I T YGMWADL FHS LLAVS AE P C VLK I QSL F VL 
DENSYPLQQDFSLLDFYPDIVKHGANARL 


5947 


3 


1317 


RG I PDRRRRGP IGRVNMDLENKVKKMGLGHEQGFGAPCLKCKEK 
CEGFELHFWRKICRNC\NVAKKSM/TVIiLSNEEDRKVGKIjF2DT 
KYTTL I AKLKSDG I PM YKRNVM I LTNP VAAKKNVS INT VTYE WA 
PPVQNQALARQYMQMLPKEKQPVAGSEGAQYRKKQLAKQLPAHD 
QDPSKCHELSPREVKEMEQFVKKYKSEALGVGDVKLPCEMDAQG 
PKQMN I PGGDRS T PAAVGAMEDKS AEHKRTQYSC YCCKLS MKEG 
DPAIYAERAGYDKLWHPACFVCSTCIIELLVDMIYFWKNEKLYCG 
RHYCDSEKPRCAGCDELI FSNEYTQAENQNWHLKHFCCFDCDS I 
LAGE I YVMVNDKPVCKPCYVKNHAVVCQGCHNAI DPEVQRVTYN 
NFSWHASTECFLCSCCSKCLIGQKFMPVEGMVFCSVECKKRMS 


5948


39 


3370 


YRERYPVSGGSVLRSALEVCWDFLSGLTEGSLLPEGFFSGPIDQ 
GNHYQMRRKGRCHRGSAARHPSSPCSVKHSPTRETLTYAQAQRM
VEIEIEGRLKRISIFDPLEIILEDDLTAQEMSBCNSNKENSERP
PVCLRTKRHKNNRVKKKNEALPSAHGTPASASALPEPKVRIVEY
S P PS APRRP p V YYKF I E XSAJE ELDNEVE YDM DEED YAWLEIVNE 
KR.KGDCVPAVSQSMFEFLMDRFEKESHCENQKQGBQQSLIDEDA 
VCCICMDGECQNSNVTLFCDMCNLAVHQECYGVPYIPEGQWLC/ 
RAHCLQSRARPADCVLCPNKGGAFKKTDDDRWGHV\ VCALW \ I P 
E\VGFAmVFIEPIIX3V1WIPPARWKLT\CNLCKEKGH/VGACI 
QCHKANC YTAFHVTCAQKAGLYM KME PVXELTGGGTT PS VRKTA 
YCDVHTPPGCTRRPLNIYGDVEMKNGVCRKESSVKTVRSTSKVR 
KKAKKAXKALAE PCAVLPTVCAP YI P PQRLNR IANQ VAIQRKKQ 
FVERAHSYWLLKRLSRNGAPLLRRLQSSLQSQRSSQQRENDEEM 
KAAKEKLKYWQRLRHDLERARLLIELLRECREKLKREQVKVEQVA 
MELRLTPLTVLLRSVLDQLQDKDPARIFAQPVSLKEVPDYLDHI 
KHPMDFATMRKRLEAQGYKNLHEFEEDFDLI IDNCMKYNARDTV 

FYRAAVRLRDQGGWLRQARREVDSIGLEEASGMHLPERPAAAP 
RRPFSWEDVDRLLDPANRAHT/2T.PFnT oct t r»MT rvr m/>»«itrrnn 

SRSKRAKLLKJCEIALLRNKLSQQHSQPLPTGPGLEGFEEDGAAL 
GPEAGEEVLPRLETLI^PRKRSRSTCGDSBVEEBSPGKRLDAGL 
TWGFGGARSEQEPGGGLGRKATPRRRCASESSISSSNSPLCDSS 
FNA PKCGRGKPAIi VRRHTLEDRS E L I S CIENGNYAKAARIAAEV 
GQSSMW I STDAAAS VLBPLKVVWAKCS G YP S YPALI IDPKMPRV 
PGHHNGVTIPAPPLDVLKIGEHMOTKSDEKLFLVLFFDNKRSWQ 

WLPKSKMVPLGIDETIDKLKMMEGRNSSIRKAVRIAFDRAMNHL 
SRVHGEPTSDLSDID 


5949

1 




3370 


YRERYPVSGGSVLRSALEVCWDFLSGLTEGSLLPEGFFSGPIDQ
GNHYQMRRKGRCHRGSAARHPSSPCSVXHSPTRETLTYAQAQRM
VEIEIEGRLHRISIFDPLEIILEDDLTAQEMSECNSNKENSERP
PVCLRTKRHKNNRVKKKNEALPSAHGTPASASALPEPKVRIVEY
S PPS APRR P PVYYKFI E KS AEELDNE VE YDMDEEDYAWLE I VNE 
KRKGDCVPAVSQSMFEFLMDRFEKESHCENQKQGEQQSLIDEDA 
VCCI CMDGECQNSNVI LFCDMCNLA VHQE CYG VP Y I PEGQ WLC / 
RAHCLQSRARPADCVLCPNKGGAFKKTDDDRWGHV\VCALW\IP 
E\VGPANTVFIEPIDGVRNIPPARWKLT\CNLCKEKGR/VGACI 
QCHKANCYTAFHVTCAQKAGLYMKMEPVKELTGGGTTFSVRKTA 
YCDVHTPPGCTRRPLNIYGDVEMKNGVCRKESSVKTVRSTSKVR 
KKAKKAKKALAEPCAVLPTVCAP YIP PQRLNR IANQ VAIQRKKQ 
FVERAHSYWLLKRLSRNGAPLLRRLQSSLQSQRSSQQRENDEEM 
KAAKE KLKYWQRLRHDLERARLL I ELLRKREKLKREQ VKVEQVA 
MELRLTPLTVLIiRS VLDQLQDKDPAR I FAQPVSLKEVPDYLDHI 
KHPMDF ATMR KRLBAQG YKNLHE FEED FDL 1 1 DNCM K YNARDTV 

FYRAAVRLRDQGGVVLRQARREVDSIGLEEASGMHLPERPAAAP 
RRPFSWBDVDRLLDPAN2AHT/3T PPOT ditt t nur tat mr.Aui/nn« 

v lsxvj_luu£' rU"4Jtt JjE. C»y JjKIiljJLjDMLtDLTCAMKSSG 

SRSKRAKLLKKEIALLRNKLSQQHSQPLPTGPGLEGFEEDGAAL 
GPEAGEEVLPRLETLLQPRKRSRSTCGDSEVEBESPGKRLDAGL 
TNGFGGARSEQEPGGGLGRKATPRRRCASESSISSSNSPLCDSS 
FNAP KCGRG KPALVRRHTLEDRS EL I SC I ENGNYAKAAR I AAE V 
GQSSMWISTDAAASVLEPLKVW/AKCSGYPS YPALI IDPKMPRV 
PGHHNGVTIPAPPLDVLKIGEHMQTiCSDEKLFLVLFFDNKRSWQ 

WLPKSKMVPLGIDETIDKLKMMEGRNSS1RKAVRIAFDRAMNHL 
SRVHGEPTSDLSDID 


5950
5951 


1166 


373 


ESRSLTMSTSQPGACPCQGAASRPAILYALLSSSLKAVPRPRSR 
CLCRQHRPVQLCAPHRTCREALDVLAKTVAFLRNLPSFWQLPPQ 
DQRRLLQGCWGPLFLLGLAQDAVTFEVAEAPVPSiLKKILLEEP 
SSSGGSGQLPDR PQPSLAAVQWLQ CCLES FWSLELSPKE \ YACL 
KGPILFNPDVPGLQAASHIGHLQQEAHWVLCEVLEPWCPAAQGR 
LTRVLLTASTLXS IPTSLLGDLFFRPI IGDVDIAGLLGDMLLLR 




143 


*44* - - 

1 



WNVKPSLLWQLFKFSDKEEHEQNDSISGKTGETGVEEMIATRK 
VEQDSKETVKLSHEDDHILEDAGSSDISSDAACTNPNKTENSLV 
3L PS C VD E VTE CNLELKDTMGI ADKTENTLE RNKI EPLG YCEDA 
SSNRQLESTBFNKS NLE WDTSTFG PE SN I LENAI CDVPDQNSK 
3LNAIESTK1ESHETANLQDDRNS0SSSVSYLESKSVKSKHTKP 
/I HS KQNMTTD A P KKI VAAKYE VIHS KTKVNVKS VKRNTD VPES 
2QNFHRP VKVRKKQIDKE PKIQSCNSGVKS VKNQAHSVLKKTLQ 
DQTLVQIFKPLTHSLSDKSHAHPGCLKEPHHPAQTGHVSHSSQK

QCHKPQQQAPAMKTNSHVKBBLEHPGVEHFKEEDKLKLKKPEKN 
LQPRQRRS S KSFSLDEP PL F I PDNI AT I RREGSDHSS S FESKYM 
WTPS KQ CG F C KK PHGNRFM VG CGRCDDW FHGDC VGLS LSQAQQM 
GEEDKE YVCVKCCAE EDKKTE I LDPDTLENQATVE FHSGDKTME 
CE KLGLS KHTTNDRT K YI DDT VKHKVKI LKRESG EGRNS SDCRD 
NEIKKWQLAPLRKJMGQPVLPRRSSEEKSEKIPKESTTVTCTGEK " 
AS KPGTHE KQEMKKKKV \ E KGVLNVHPAASAS KPS ADQ I RQS VR 
HSLKDILMKRLTDSNLKVPEEKAAKVATKIEKELFSFFRDTDAK 
YKNKYRSLMFNLKDPKNNILFKKVLKGBVTPDHLIRMSPEELAS 
KELAAWRRRENRHTIEMIEKEQREVERRPITKITHKGEIEIESD 
APMKEQEAAMEIQEPAANKSLEKPEGSEK\RKEEVDSMSKDTTS 
QHRQHLFDLNCKICIGRMAPPVDDLSPKKVKVWGVARKHSDNE 
AESIADALSSTSNILASEFFEEEKQESPKST?SPAPRPEMPGTV 
EVESTF1ARLNFIWKGFINMPSVAKFVTKAYPVSGSPEYLTEDL 
PDS IQ VGGRI S PQTVWDYVEKI KASGT KE I CWRFTP VTEEDQ I 
S YTLLFAYFSSRKR YGVAANNNKQVKDMYL I PLGATDKI PHPLV 
P FDG PGLE LHRPNLLLGL 1 I RQKLKRQHS ACAS TSH I AETPES A 
PPIALPPDKKSKIEVSTEEAPEEENDFFNSFTTVLHKQRNKPQQ 
NLQEDLPTA VE PLME VTKQEPP KPLR FLPG VL I GWENQPTTLE L 
ANKPL P VDD I LQS LLGTTGQVYDQ\ AQS VMEQMTVKB I P FLNEQ 
TNSKI EKTDNVE VTDGENKE 1KVKVDNI SESTDKSAE I ETS WG 
SSS ISAGSLTS LS LRGKPPDVSTEAFLTNLS IQSKQEETVESKE 
KTLKRQLQEDQENNLQDNQTSNSSPCRSNVGKGNIDGNVSCSEN 
LVANTARS PQ FI NLKRD PRQAAGRS Q PVTTS E S KDGD S CRNGEK 
HMLPGLSHNKEHLTEQINVEEKLCSAEKNSCVQQSDNLKVAQNS 
PSVENIQTSQAEQAKPLQEDILMQNIETVHPFRRGSAVATSHFE 
VGNTCPSEFPSKSITFTSRSTSPRTSTNFSPMRPQQPNLQHLKS 
SPPGFPFPGPPNFPPQSMFGFPPHLPPPLLPPPGFG\FA\QNPM 
VPWPPW\HLP\GQPQRMMGPLSQASRYIGPQNFYQVKDIRRPE 
RRHSDPWGRQDQQQLDRPFKRGKGDRQRFYSDSHHLKRERHEKE 
WEQ ES E RHRRRDRS QDKDRDRKSRE EGHKDKERARLSHGDRGTD 
GKASRDSRNVDKKPDKPKSEDYEKDKEREKSKHREGEKDRDRYH 
KDRDHTDRTKSKR 


5952 


3226 


639 


PPARRSARDIiPRALSMEAARPSGSWNGALCRLL\LVTL\AFLIF " 
ASDACKNVTLHVPSKLDAEKLVGRVNIjKECFTAANLIHSSDPDF 
QI LEDGS VYTTNTI LLSS E KRS FT I LLSNTEK QE KKKI FVFLEH 
QTKVLKKRHTKEKVLRRAKRRWAPIPCSMLENSLGPFPLFLQQV 
QSDTAQJJYTIYYSIRGPGVDQEPRNLFYVERDTGNLYCTRPVDR 
EQYESFEIIAFATTPDGYTPSLPLPLI IKIEDENDNYPIFTEET 
YTFTI FENCRVGTTVGQVCATDKDEPDTMHTRLKYSI IGQVPPS 

PTLFSMHPTTGVITTTSSQLDRELIDKYQLKIKVQDMDGQYFGL

QTTSTCI I NIDDVNDHLPTFTRTS YVTS VEENTVD VE I LR VTVE 
DKDIjVNTANWRANYTILKGNENGNFKIVTDAKTNEGVLCWKPL 

NYEBKQQMILQIGVVNEAPFSREASPRSAMSTATVTVNVEDQDE

var-cv.«r xr xv^i. viuiUtnAAVul ioIMlji KAi UfETRSSSoIRYKKL 

TDPTGWVTIDENTGSIKVFRSLDREAETIKNGIYNITVLASDQG

GRTCTGTLGI ILQDVNDNS PFI PKKTVI ICKPTMSS AEIVAVDP 
DEPIHGPPFDFSLESSTSEVQRMWRLKAINDTAARLSYQNDPPF 
wu* * vf a. l ytujftuunoDvi auuv iij\*U\*x IJiNIJCTJiKVDPRIGG 
GGVQLG KWAI LAI LLG I ALFFCI LFTLVCGASGTSKQPKVI PDD 
IJWNLIVSrn^APGDDKVYSANGFTTQTVGASAQGVCGTVGSG 
IKNGGQETI EMVKGGHQTS ES CRGAGHHHTLDSCRGGHTEVDNC 
RYTYSEWHSFTQPRLGEESIRGHTLIKN 


5953 


330 


811 


PLIOJPDPGWYWWVKQESEISKESQEMDARPKLDLGFKEGQTIK" 
LCIGNITNKKGGASKPRTARGGGLSLLPPPPGGKVTIPPPSS /V 
KLPSTNHVTPPS IPKSNHGGSDADILLDLDSPAPVTTPAPTPVS 
VSNDLWGDFS TAS S S VPNQAPQPSNWVQF 


5954 


32 


2130 


PPPPPPKLANMADLEAVLADVSYIJ^EKSKATPAAkASKRI^^ 

PEPSIRSVMQKYLAERNEITFDXIFNQKIGFLLFKDFCLNEINE 

AVPQVKFYEEIKBYEKLDNBEDRLCRSRQIYDAYIMKELLSCSH 
PFSKQAVEHVQSHLSKKQVTSTLFQPYIBEICESLR6DtFQKPM~ 
ESD X FTRFCQWKNVE LNI HLTMNE FS VHR 1 1 GRGG FGE VYGCRK 
AIDTGKMYAMKCLNKKRIKMKQGETIJVtjNERIMLSLVSTGDCPFI 

VCMTYAFHTPDKLCFILDLMNGGDLHYHLSQHGVFSBKEMRFYA 
TE I ILGLEHMHNRFWYR t) X, V P AN T T »T Tiv w na d ToVm^r * 

FSKKKPHASVGTHGxWAPEVLQKGTAYDSSADWFSLGCMLFKLL 
RGHSPFRQHKTKDKHEIDRMTLTVNVELPDTFSPELKSLLEGLL 
CRDVS KRLGCHGGGSQEVKEHS FFKGVDWQH V YLQKYPPPLI PP 
RGEVNAADAFDIGSFDEEDTKGIKLLDCDQELYKNFPLVISERW 
QQE VTETVYEAVNADTDK I EAR KRAXNKQLGHE ED YALG KDCI M 
HGYMLKLGNPFLTQWQRRYFYLFPNRLEWRGEGESRQNLLTMEQ 
ILSVEETQIKDKKCILFRIKGGKQFVLQCESDPEFVQWKKELNE 
TFKEAQRLLRRAPECFLNKPRSGTVELPKPSLCHRNSNGIi 


5955 


1726 


444 


KREREFRLAVCPLRYPSAYESSPGTELRECGLCRSGQEFADCRR * 
r tvxtvju vijcjvjW l NLtk? v JjQJjTKDPIjKTPGRLDHGTRTAFI hhreq 
VWKRCINI WRDVGLFG VLNE I ANS EEEVFE WVKTASGWALALCR 
WASSLHGSLFPHLSLRSEDLIAEFAQVTNWSSCCLRVFAWHPHT 
NKFAVALLDDSVRVYNASSTIVPSLKHRLQRNVASLAWKPLSAS 
VLAVACQSCILIWTLDPTSLSTRPSSGCAQVIiSHPGHTPVTSIjA 
WA PSG GRLLS AS PVDAAI R VWDVS TETCVPL PW FRGGG VTNLL W 
SPDGS KILATTPSAVFRVWEAQMWTCERWPTLSGRCQTGCWS pd 
GSRLL FTVLGE PL I YS LS F PERCGEGKG\ ALE VQSQQRLWQ I CL 
RQQYRHQMVRRGLGERLTPWSGTPVGNVWLCL 


5956 


1705


139 


gvgvrgaramatvqekaaalnlsalh^pahrppgfSvaqkpfgA 

TYVWSSIINTLQTQVEVKKRRHRLKRHNDCFVGSEAVDV1FSHL 

IQNKYFGDVDIPRAKWRVCQALMDYKVFEAVPTKVFGKDKKPT

v au£>&> ^&L«XKFTTI PNQDSQIiGKENKLYSPARYADALFKS SD I R 

SASLEDLWENLSLKPANSPHVNISATLSPQVINEVWQEETIGRL

LQLVDLPLLDSLLKQQEAVPKIPQPKRQSTMVNSSNYLDRGILK 
AYSDSQEDEWLSAAIDCSEYLPDQMVVEISR5FPEQPDRTDLVK 
ELLFDAIGRYYSSREPLLNHLSDVHNGIAELLVNGKTEIALEAT 
QLLLKLLDFQNREEFRRLLYFMAVAANPSEFKLQKESDNRMWK 
RIFSKAIVDNKNLS:<GKTDLLVLFL\MDHQKDVFKIPGTL\HKI 
VS \ VK\ LMAIQNGRDPNRDAGYI YCQRIDQRDYSNNTEKTTKDE 
him* uu jv ± uutuutr iujo/unaK K.K \ DLG QFYKCHPD I F I E HFGD 


5957 


1479 


451 


ELQVAVAMDTLDRWKPKTKRAKRFLEKREPKLNENIKNAMLIK *" 
GGNANATVTKVLKDVYALKKPYGVLYKKKNITRPFEDGTSLEFF 
SKKSDCSLFMFGSHNKKRPNNLVIGRMYDYHVLDMIELGIENFV 
S LKD I KNS KCPEGTKPML I FAGDDFDVTEDYRRLKS LLI D FFRG 
PWSNIRLAGIJBYVLHFTAIiNGKIYFRSYKLLLKKSGCRTPRIE 
LEEMGPSLDLVLRRTHLASDDLYKLSMKMPKALKPKXKKNISHD 
T?GTTYGRIHMQKQDLSKLQTRKM\KGLKKRPAERIT3DHEKKS 
KRIKKKLMELSQPLLFHCVLLKRI IKHQSIQSFL 


5958 


1 


3138 


AAALGMLLWFPACQAFNLDVEKLTVYSGPXGSYFGrAVDFHIPD" 
ARTAS VLVGAPKANTS Q PD I VEGGAVY Y CP W PAEGS AQCRQ I P F 
DTTNNRK I RVNGTKEP I E FKSNQWFG\ATVKA\HKGKSCGPVAP 
LLFTWRNFLKPTPEKGPVGTCYVAIQNFSAYAEFSPCGNSNADP 
EGQGY CQAGFS LDF YKNGDLI VGG PGS F YWQGQVI TASVAD 1 1 A 
NYSFKDI LRXLAGEKQTEVAPAS YDDS YLG YSVAAGEFTGDSQQ 
ELVAGIPRGAQNFGYVS XINSYDMTFIQNFTGEQMAS YFGYTW 
VSDVNSDGLDDVLVGAPLFMEREFESNPREVGQIYLYLQVSSLL 
FRDPQILTGTETFGRFGSAMAHLGDLNQDGYKDIAIGVPFAGKD 
QRGKVLIYNGNKDGLNTKPFPKFCQGVWASHAVPSGFGFTLRGD 
SDIDKNDYPDLIVGAFGTGKVAVYRARPVVTVDAQLLLHPMX IN 
LENKTCQVPDSMTSAACFSLRVCASVTGQS I ANTIVLMAEVQLD 
SLKQKGAIKRTLFLDNHQAHRVFPLVIKRQKSHQCQDFIVYLRD 
ETEFRDKLSPINISLNYSLDESTFKEGLEVKPILNYYRENIVSE 
QAHI LVDCGEDNLCVPDLKLSARPDKHQVI IGDENHLMLI INAR 
NEGEGAYEAELFVMIPEEADYVGIERNNKGFRPLSCEYKMENVT 
RMWCDLGNPMVSGTNYS LGLRFAVPRLEKTNMS INFDLQ I RSS 
N KDN PDSNFVSLQINITAVAQVE IRGVSHPPQI VLP IRNNEPEE 
EPHXBEEVGPLVEHIYELHNIGPSTISDTILEVGWPFSARDEFL 
LY I FH IQTLG PLQCQ PNPNI NPQDI KPAAS PEDTPELS AFLRNS 
TI PHLVRKRDVHVVEFHRQSPAKILNCTNI ECLQISCAVGRLEG 
GES AVLKVRS RLWAHTPLQRKNDPYALASLVS FBVKKMPYTDQP 
AKLPEGS I AI KTS VI WATPNVS FS I PLWVI ILAILLGLLVLAI L 
TLALWKCGFFDRAR PPQEDMTDREQLTNDKTPEA 


5959


1 


1166 


GTSG YAAQQLPS LiLKERJB FH LGTLNKVFAS QWLNHRQ WCGTKC 
NTLFWDVQTS Q I TKI P I LKDREPGG VTQQGCG IHAI ELNPSRT 
LLATGGDNPNS LAI YRLPTLDPVCVGDDGHKDWI FS IAW ISDTM 
AVSGS RDGSMGLWEVTDDVLTKSDARHNVSRVP VYAHI THKALK 
DI P KEDTN PDNCKVRALA FNNKN KELGAVS LDG YFHLWKAENTL 
SKLLSTKLPYCRENVCLAYGSBWSVYAVGSQAHVSFLDPRQPSY 
NVKSVCSRERGSGIRSVSFYEHIITVGTGQGSLLFYDIRAQRFL 
EERLS ACYGS KPRLAGENLKLTTG\ KGWLNHDETWRKYFSDIDF 
FPNAVYTHCYDSSGTKLFVAGGPLPSGLKGNYAGLWS 


5960 


2653 


870 


FVWS DGGPRPRRGPAVGAGAAHLSDP WAMT PGT ANRATN P LNKE 
LDWAS INGFCEQLNEDFEGPPLATRLLAHKIQS PQEWEAIQALT 
VLETCMKSCGKRFHDEVGKFRFLNELIKWSPKYLGSRTSEKVK 
NKILELLYSWTVGLPEEVKIAEAYQMLKKQG\IVKSDPKLPDDT 
TFPLPPPRPIOmFEDEEKSKMIJ«LLKSSIIPEDLRAANXLIKE 
KVQEDQKRMEKISKRVNAIEEVNNNVKLLTEMVMSHSQGGAAAG 
SSEDL\MKEL\YQRCERMRPTLFPTGRVDTEDND\EALAEILQA 
HUAiuiyvAHu i JvyuvitoiSiS VWUDATAG5IPGSTS AliLDLSGLDL 
P PAGTT YPAM PTR PGEQAS PEQPS AS VSLLDDE LMS LGLS D P T P 
PSGPSLDGTGWNS FQSSDATEPPAPALAQAPSMESRP PAQTS LP 
ASSGLDDLDLLGKTLLQQSLPPESOQVRWEKQQPTPRLTLRDLQ 
NKSSSCSSPSSSATSLLHTVSPEPPRPPQQPVPTELSLASITVP 
LES IKPSNILPVTVYDQHGFRILFHFARDPLPGRSDVLWWSM 
LSTAPQ P I HN I VFQS AVPKVMKVKLQ P PSGTE L PAFNP I VHP S A 

ITQVLLLANPQKEKVRLRYKLTFTMGDQTYNEMGDVDQFPPPET 
WGSL 


5961 


198 


3147 


SGEPRPEPGNMATCIGEKIEDFKVGNLLGKGSFAGVYRAES^IT^ 
GLEVAIKMIDKKAMYKAGMVQRVQNEVKIHCQLKHPSILELYNY 
FEDSNYVYLVLEMCHNGEMNRYLKNRVKPFSENEARHFMHQI IT 
GMLYLHSHG I LHRDLTLSNLLLTRNMNI KIADFGLATQLKMPHE 
KHYTLCGTPNY ISPEIATRSAHGLESDVWS LGCMFYTLL IGRP P 
FDTDTVKNTLNKWLADYEMPTFLSIEAKDLIHQLLRRNPAJDRL 
SLS S VLDHP FMS RNS STKS KDLGT VEDS I DSGHAT I STAI TAS S 
STSISGSLFDKRRLLIGQPLPNKMTVFPKNKSSTDFSSSGDGNS 
FYTQWGNQETSNS GRGR V I QDAE ER PH S RYLRRAYS SORSGTSN 
SQSQAKTYTMERCHSAEMLSVSKRSGGGENEBRYSPTDNNANIF 
NFFKEKTS SSSGS FERPDNNQALSNIILCPGKTPFPFADPT PQTE 
TVQQWFGNLQINAHLRKTTEYDSISPNRDFQGHPDLQKDTSKNA 
WTDTKVKKNSDASDNAHSVKQQNTMKYMTALHSKPEIIQQECVF 
GSDPLSEQSKTRGMSPPWGYQNRTLRSITSPLVAHRLKPIRQKT 
KKAWS I LDSEEVCVELVKE YASOEYVKEVLOI SSDGNfTTTT w 

PNGG\RGFPLA\DRPPSPT\DNISR\YSF\DNLPEKYWRKYQYA 
S RFVQL VRS KS P K I T YFTR YAKCIIMENS PGAD FE VW FYEG VK I 
HKTEDFIQVIEKTGKSYTLKSESEVNSLKEEIKMYMDHANBGHR 
ICLALES I ISEEERKTRSAPFFPIIIGRKPGSTSSPKALSPPPS 
VDSNYPTRDRAS FNRMVMHSAASPTQAP ILNPSM VTN3GLGLTT 
TASG TD I S SNSL KDCL PKSAQ LL KSVF VKNVG WATQ \ LTSGAVW 
VQFNDGSQLWQAGVSSISYTSPNGQ\TTR\YGENBKLPDYIKQ 
KLQCLSS ILLMFSNPTPNFH 


5962 


20 


2447 


RVCSSSASTASQAVMADAWEE^tlUlLAAbFQRAQFAEATQRLSER 
NCIEIVNKLIAQKQLEWHTLDGKEYITPAQISKEMRDELHVRG 
GRVN I VDLQQ V I NVDL I H I ENR IGDI I KS EKHVQLVLGQLI DEN 
YLDRLAE E VNDKLQE S GQVT I SE LCKT YDLPGNFLTQALTQRLG 
RIISGHIDLDNRGVIFTEAFVARHKARIRGLFSAITRPTAVNSL 
ISKYGFQEQLLYSVLESLVNSGRLRGTWGGRQDKAVFVPDIYS 
RTQST WVDS FFRQNG YLEFDALSRLG I PDAVS Y I KKRYKTTQLL 
plkaacvgqglvdqveasveeaissgtwvdiapllptslsvbda ' 
aillqqvmrafskqastwfsdtvwsekfXindctelfrelmh 
qkaekemknnpvhli teedlkqi stlesvstskkdkkderrrka 
tegsgsmrgggggnareykikkvkkkgrkdddsddesqsshtgk 

KKPEISFMFQDEIEDFLRKHIQDAPEEFISELAEYLIKPLNKTY 
LBWRSVFMSSTTSASGTGRKRTIKDLQEEVSNLYNNIRLFEKG 
MKFFADDTQAALT KHLL KS VCTD ITNLI FNFLAS DLMMAVDDPA 
AITSEIRKKILSKLSEETKVALTKLHNSLNEKSIEDFISCLDSA 
AEACD I MVKRGD K KRERQ I L FQHRQALAEQLKVTEDPAL I LHLT 
SVLLFQFSTHSMLHAPGRCVPQI IAFLNSKI PEDQHALLVKYQG 
LWKQLVSQSKKTGQGDYPLNNELDKSQEDVASTTRKELQELSS 
SI KDLVLKSRKSSVTEE 


5963 


62 


1130 


PWNPQDFPGNRGLMG\QKGEIGPP\GQQGKKGAPGMP\GLMGSN 
GSPGQPGTPGSKGSKGEPGIQGMPGASGLKGEPGATGSPGEPGY 
MGLPGIQGKKGDKGNQGEKGIQGQKGEN3RQGIPGQQGIQGHHG 
AKGERGEKGEPGVRGAIGSKGESGVDGLMGPAGPKGQPGDPGPQ 
GPPGLDGKPGREFSEQFIRQVCTDVIRAQLPVliLQSGRIRNCDH 
CLSQHGS PGI PG PPGP IGPEGPRGLPGLPGRDGVPGLVGVPGRP 
GVRGLKGLPGRNGEKGSQGFGYPGEQGPPGPPGPEGPPGISKEG 
PPGDPGLPGKDGDHGKPGIQGQPGPPGICDPSLCFSVIARRDPF 
RKGPNY 


5964


3 


2147 


SCRTRGRLSP LQP RE AGS S RGSRARS EP PRFGGMEE ACQVQTTK 
RGDPHELRNIFLQYASTEVDGERYMTPEDFVQRYLGIjYNDPNSN 
PKIVQLLAGVADQTKDGLISYQEFLAFESVLCAPDSMFIVAFQL 
FDKSGNGEVTFENVKEIFGQTIIHHHIPFNWDCEFIRLHFGHNR 

kkhlnyteftqflqelqleharqafalkdksksgmisgldfsdi 
mvtirshmltpfveenlvsaaggsishqvsfsyfnafnsllnnm 

ELVRKI YSTIAGTRKDAEVTKEEFAQSAIRYGQATPIjEIDI LYQ 
LADL YNASGRLTLAD I ER I APLAEGALP YNLAE LQRQQS PGLGR 

piwlqiaesayrftlgsvagavgatavypidlvktrmqnqrgsg 
swgelmyknsfdcfkkvlryegffglyrglipqligvapbkai 
kltvkdfvrdkftrrdgsvplpaevlaggcaggsqviftnplei 
vkirlqvageittgprvsalnvlrdlgifglykgakacflrdip 
fsaiyfpvyahcklllai)enghvgglnlla^amag\vpaaslv 

TPADVIKTRLQVAARAGQTTYSGVIDCFRKIL\REEGPSAFWKG 

taarvfrsspofgWtlvtyellqrgfyidfgglkpagseptpk 
sriadlppanpdhiggyrlatatfagienkfglylpkfkspsva 
wqpkaavaatq 


5965 


1 


1498 


MVTWLYRFLPTSNMAAlOiRSLLPPDLRLQFWliHARLQKCFLSRG 
CGSYCAGAKASPLPGKMAMGLMCGRRBLLRLLQSGRRVHSVAGP 
SQWLGKPLTtRLLFPAAPCCCRPHYLFLAASGPRSLSTSAISFA 
EVQVQAPP WAATPS PTAVPEVASGETAD WQTAAEQS FAELGL 
GS YTPVGLIQNLLE FMHVDLGL P WWGAI AACT VFARCLI FPLIV 
TGQREAAR IUNHLP E I QKFSS R I REAKLAGDH IEY YKASS EMAL 
YQXKHGIKLYKPLILPVTQAPIFISFFIALREMANIiPVPSLQTG 
GLWWPQDLTVSDPIYILPLAVTATMWAVLELGAETGVQSSDLQW 
MRNVIRMMPL I TLP ITMHFPTAVFMYWLSSNLFSLVQVSCLR I p 
AVRTVLKI PQRWHDLDKLPPREGFLES FKKGWKNAEMTRQLRE 
REQRMRKQLELAARGPLROTFTHNPLLQPGKDNPPNIPSS\SSS 
SSKPKSKYPWHDTLG 




102 


1925 


RSKQVIWLTKRRQADTKMQHLWAAIEIIRNQKQIANIDRITK - 
YMSRVHGMHPKETTRQLSLAVKDGLIVETLTVGCKGSKAGIEQB 
G YWL PGDE I D WETENHD W YCFE CHL PGE VL I CDLCFRVYHS KCL 
S DE FRLRDS S S P WOCP VCRS I KTCKNTNKCiPMrzTVT .r ptvcdmvd 

RAIDLNKKGKDNKHPMYRRLVHSAVDVPTIQEKVNEGKYRSYEE 
FKADAQLLLHNTVIFYGADSEG^IARMLYKDTCHEL\DELQLC 
KNCFYLANARPDNWFC YPC I PNHELDWAKMKGFGFWPAKVMQXE 
DNQVDVRF FGHHHQRAW I PS EN I QD I T VN I HRLHVKRSMG WKKA 
CDELELHQRFLREGRFWKSKNEDRGEEEAESSISSTSNEQLKVT 
QEPRAKKGRRNQSVEPKKEEPEPETEAVSSSQEIPTMPQPIEKV 
SVSTQTKXl^ASSPRMLHRSTOTTNIXSVCQSMCHDKYTKIFNDF 



414 
SEQ ID
NO:


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


5967 






KDRMKSDHIOiKTERVVREALKICbRSEMEEBiaQAVNKAVANMQG 
EMDRKCKQVKEKCKEEFVEEIKKLATQHKQLISQTXKKQWCYNC 
EEEAMYHCCWNTSYCS IKCQQEHWHAEHKRTCRRKR 


5968


102 


1925 


RS KQ VMARLTKKRQADTKAIQHLWAAI Ell RNQKQI ANI DR I TK 
YMS R VHGMH PKETTRQLS LAVKDG LI VETLTVGCKGS KAG I EQ E 
G YWLPGD E I DWETENHD W YC FECHL PG EVL I CDLCPR VYHS KCL 
S DE FRLRD SSS P WQCP VCRS I KKXNTN KQ EWGT YLRF I VS RMKE 
RA I DIiNKKGKDKKH PMYRRLVHS AVDVP T I QE KVNEGKYRS YEE 
FKADAQLLLHNTVIFYGADSEQADIARMLYKDTCHEL\DELQLC 
KNCFYLANARPDNWFCYPCIPNHELDWAKMKGFGFWPAKVMQKE 
DNQVDVRFFGHHHQRAWIPSENIQDITVNIHRLHVKRSMGWKKA 
CDELELHQRFLREGRPWKSKNEDRGEEEAESSISSTSNEQLKVT 
QEPRAKKGRRNQSVEPKKEEPEPETEAVSSSQEIPTMPQPIEKV 
SVSTQTKKLSASSPRMLHRSTQTTNDGVCQSMCHDKYTKIFNDF 
KDRMKSDHKRETERWREALEKLRSEMEEEKRQAVNKAVANMQG 
EMDRKCKQVKBKCKEEFVEEIKKIATQHKOLISQTKKKQWCYNC 
EEEAM YHCC WNTS YCS I KCQQEHWHAEHKRTCR R KR 


5969 


81 


1288 


VRFPRRGGAPPTVLTPGRQQGVFLGPQRPdSEPDIPARGQPHPP 
RPVGVSTSAQAQVQPPAMHRRRLALGLGFCLLACTSLSVliWVYL 
ENWLPVSYVPYYLPCPEIFNMKLHYKREKPLQPWMSQYPQPKL 
LEHR PTQLLTLTPWLAP I VS EGT FNP E LLQH 1 YQPLNLT I G VTV 
FAVGN/HFLESAEE FFMRGYRVH YYI FTDNPAAVPGVPLGPHRL 
LSS I P I QGHSH WEETSMRRMETISQH I AKRAKRE VD YL FCLD VD 
MVFRNPWGPETLGDLVAAIHPSYYAVPRQQFPYERRRVSTAFVA 
DS EG D FYYGGAVFGGQ VAR VYE FTRGCHMA I LAD KANG 1 MAAWR 

ESSHLNRKFISNKPSfCVLSPEYLWDDRKPQPPSLKLIRFSTLDK 
DISCLRS 




1126 


503 


UVGFNIKRKRCULDVFLESPRKPSGRRDRAPEKQRRIAANKCLC 
TGVREGEPPS/TTSQKVKEAGRDFTYLIWLFGISITGGLFYTI 
FKBLFSSSSPSKIYGRALEKCRSHPEVIGVFGESVKGYGEVTRR 
GRRQHVRFTEYVKDGLKHTCVKFYIEGSEPGKOGTVYAQVKENP 
GSGEYDFRYIFVEIESYPRRTIIIEDNRSQDD 




316 


4712 



SQDNIGHRLl^WiGWKLGCX3IX5i^LO^RTDPIPIwkYDVMGMG 

RMEMELDYAEDATERRRVLEVEKEDTEELRQKYKDYVDKEKAIA 

KALEDLRANF YCEL CD KQ YQKHQE FDNH I NS YDHAHKQRLKDLK 

QREFARNVSS RSRKDE KKQE KALRRLHELAEQRKQAECAPGSGP 

M FKPTTVA VDE EGG EDDKDE S ATNS GTG ATAS CGLGS E FS TD KG 

GP FTAVQ I TNTTGLAQAPGLAS QG I S FG I KNNLGTPLQKLG VS F 

SFA1CKAPVKLESIASVFKDHAEEGTSEDGTKPDEKSSDQGLQKV 

GDSDGSSNLDGKKEDEDPQDGGSLASTLSKLKRMKREEGAGATE 

PEYYHYIPPAHCKVKPNFPFLLFMRASEQMDGDNTTHPKNAPES 

KKGSS PKPKSCI KAAASQGAEKTVSE VSEQPKETSMTEPSEPGS 

KAEAKKALGGDVSDQSLESHSQKVSETQMCESNSSKETSLATPA 

GKESQEGPKHPTGPFFPVLSKDESTALQWPSBLLIFTKAEPSIS 

YSCNPLYFDFKLSRNKDARTKGTEKPKD1GSSSKDHLQGLDPGE 

PNKSKEVGGEKIVRSSGGRMDAPASGSACSGLNKQEPGGSHGSE 

TEDTGRSLPSKKERSGKSHRHKKKKKHKKSSKHKRKHKADTEEK 

SSKAESGEKSKKRKKRKRKKNKSSAPADSERGPKPEPPGSGSPA 

PPRRRRRAQDDSQRRSLPAEEGSSGKKDEGGGGSSSQDHGGRKH 

KGELPPSSCQRRAGTKRSSRSSHRSQPSSGDEDSDDASSHRLHQ 

KS PS Q YS EEE EEEDSGS EHS RSRSRSGRRHSSHRS SRRS YS S SS 

DAS S DQS CYS RQRS YSDDS YS DYS DRS RRHS KRS HDS DDS D YAS 

SKHRSKRHKYSSSDDDYSLSCSQSRSRSRSHTRERSRSRGRSRS 

^av.&K^Ki>KKK5RSTTAHSWQRSRSYSRDRSRSTRSPSQRSGSR 

=CRSWGHESPEERHSGRRDFIRSKIYRSQSPHYFRSGRGEGPGKK 

3DGRGDDS KATG P PS QNSN IGTGRG SE GD CS PEDKNS VTAKLL L 

:kiqsrkverkpsvseevqatpnkagpklkdppqgyfgpklpps 

jGNKPVLPLIGKLPATRKPNKKCEESGL3RGEBQEQSETEBGP d 

;ssdalfghqfp\seettgplldpppeesksgbvtadhpvaplg 
^pahfdcylgdptishnylpdpsdgntlesldsssqpgpvessl 
,piapdi,ehfpsyappsgdpsiestdgaeda\slaplesqpitf 
SEQ
ID 
NO: 



5971 



Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



53 



Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)



"2149 



TPEEMEKYSKLQQAAQQHIQQO LLAKQVKAFPASAAXAPATPAL 
QPIHIQQPATASATSITTVQHAILQHHAAAAAAAIGTHPHPHPQ 
PLAQVHHIPQPHLTPISLSHLTHSI I PGHPATPLASHPIHI IPA 

SAIHPGPPTFHPVPHAALYPTLIAPRPAAAAATALHLHPLLHPI 
FSGQDLQHPPSHGT 



syLYFVGVDMDNPIGNWDGRFDGVQLCS PACVESTlLIiHINDII 
PESVTQERRPPKLAFMSRGVGDKGSSSHNKPKATGSTSDPGNRN 
RSEL PYTLNGSS VDS QPQS KSKNTW Y I DE VAEDPAKS LTE I S TD 
FDRSSPPLQPPPVNSLTTENRFHSLPFSLTKMPNTNGSIGHSPL 
SLSAQS VMEELNTAP VQES P PLAMPPGNSHGLEVGS LAE VKENP 
PFYGVIRWIGQPPGLNEVLAGLELEDECAG\CTDGTF/REGTRY 
FTCALKKALFVKLKS CRPDSRFAS LQPVSNQ I ERCNSLAI WEAY 
LSEWEENTPTQKWEKEGLEIMIG\KKKGIQGHYNSCYLDSTLF 
CLFAFSSVLDTVLLRPKEKNDVEYYSETQELLRTEIVNPLR1YG 
YVCATKIMKLRKILEKVEAASGFTSEEKDPEEFLNILFHHILRV 
BPLLKIRSAGQKVQDCYFYQIFMEKNEKVGVPTIQQLLEWSFIN 
SNLKFAEAPSCLIIQMPRFGKDFKLFKKIFPSLELNITDLLEDT 
PRQCRICGGLAMYECRECYDDPniSAGKIKQFCKTCNTQVHLHP 
KRLNHKYNP VSLPKDLPDWDWRHGC I PCQNMEL FA VLC I ETSH Y 
VAFVJCYGKDDSAWLFFDSMADRDGGQNGFNIPQVTPCPEVGEYL 
KMS LEDLH S LPS RR 1 QG CARRLLCDAI YVPCTQS PTMS LY K 



5973 



I LLAGS PS P RDQ CSQRQS SGGDKEL VT RG CTFSTAWS PS AMTQ 
EPFREELAYDRMPTLERGRQDPASYAPDAKPSDLQLSKRLPPCF 
SHKTWVFSVLMGSCLLVTSGFSLYLGNVFPAEMDYLRCAAGSCI 
PSA I VS FTVS RRNANV I PNFQI LF VS TFAVTTTCL I WFG CKLVL 
NPSAININFNLILLLLLELLMAATVIIAARSSEEDCKKKKGSMS 
DS AN I LDEVP F PARVLXS YS WB VI AGI S AVLGG 1 1ALNVDDS V 

SGPHLSVTFFWILVACFPSAIASHVAAECPNKCLVEVLIAISSL 

TSPLLFTASGYLSFSIMRIVEMFKDYPPAIKPSYDVLLLLLLLV 

LLLQA/GPQHGHRHPVRALQGQCKAAGCILGHPERPAGAPGWGG 

GQEPPEGVRQGESLE5RRGAMGPVTPRRGNRVAAPSLAPGMETH 
NP 



"4293 



2007 *~" | NGDG KDL FGH I WAWRSNG 1 1 SNFRRS PHAGMAEDE PDAKS P KTG* 
GRAPPGGAE AGEPTTLLQRLRGT I S KAVQNKVEGI LQDVQKFSD 
NDKLYLYLQLPSGPTTGDKSSEPSTLSNEEYMYAYRWIRNHLEE 
HTDTCLPKQSVYDAYRKYCESLACCRPLSTANFGKIIREIFPDI 
KARRLGGRGQSKYCYSGIRRKTLVSMPPLPGLDLKGSESPEMGP 
EVTPAPRDELVEAACALTCDWAERILKRSFSSIVEVARFLLQQH 
LISARSAHAHVLKAMGLAEEDEHAPRERSSKPKNGLENPEGGAH 
KKPERLAQPPKDLEARTGAGPLARGERKKSWESSAPGANNLQV 
NALVARLPLLLPRAPRSLI PP I PVSPPILAPRLSSGALKVATLP 
LSSRAGAPPAAVP I INM I LPTVPALPGPG PGPGRAPPGGLTQ PR 
GTENRE VG IGGDQG PHDKG VKRTAEVP VS EASGQAPPAKAAKQD 
IEDTASDAKRKRGRPLKKSGGSGERNSTPLKSAAAMESAQSSRIj 

pwetwgsggegnsaggaerpgpmgeaekgavlaqg\qgdgtvsk 
ggrgpgsomtkeaedkiplvpskvsvikgsrsqkeafplakgev 
dtapqgnkdlkehvlqsslsqehkdpkatpp 



2200 



LGLQMHTTSG K I HQAM VTS IjNED^ES VTVE Wl ENG DTKGK \ B I D 

lesifslnp\dl\vpdgeiepsp\etppppassakvnkivknrr 
tv\asikndpps\rdnrwgsararpsqfpeqfssaqqngsv\s 
dispvqaakxefgppsrrksncvkeveklqekrekrrlqqqelr 
ekraqdvdatnpnyeimcmirdfrgsldyrplttadpidehric 
vcvrkrplnkketqmkdldviti pskewmvhepkqkvdltryl 
enqtfrfdyafddsapnemvy rftarplvet i fergmatcfayg 
qtgs gfcthtmggdfs gknqdcs kg i yalaard vflmlkkpnykk 

LELQVYATFFE I YSGKVFDIJiNRKTKLRVLEDGKQQVQVVGLQE 
REVKCVEDVLKLIDIGNSCRTSGQTSANAHSSRSHAVFQIILRR 
KGKLHGKFSLIDLAGNERGADTSSADRQTRLEGAEINKSLLALK 
ECIRALGRNKPHTPFRASKLTQVLRDSFIGENSRTCMIATISPG 
MASCENTLKTIjRYANRVKELIVDPTAAGT!)VRPIMHHPPNQI\DD 
LBTQWGVGSSPQRDDLKLLCEQNEEEVSPQLFTFHEAVSQMVEM 
nucleotide
corresponding
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








EEQWBDHRAVPQESIRWLEDEKALLEMTEBVDYDVDSYATQLE 
AI LEQKI D I LTELRDKVKS FRAALQE EEQASKQ I NPKR PRAL 


5975 


4293 


2200 


LGLQMHTTSGRIHQAMVTSLNEDNESVTVEWIENGDTKGK\EID" 
LESIFS LNP\ DL \ VPDGB I E PS P \ET PPP PASSAKVNK I VKNRR 
TV\ AS I KNDPPS \ RDNRWGSARARPSQFPEQFSS AQQNGS V\ S 
D I S P VQAAKKEFG P PS RRKSNCVKEVEKLQE KREKRRliQQQELR 
EKRAQDVDATNPNYEI MCMI RD FRGSLDYRPLTTADP I DEHRI C 
VCVRKRPIjNKKETQMKDLD V I T I PS KDWMVHE PKQKVDLTRYL 
ENQTFRFD YA FDDSAPNEMV YR FTARPLVETI FERGMATCFA YG 
qtg5gkthtmggd fs g knqdcs kg i yalaard vflmlkkpn ykk 
LELQVYATFFEIYSGKVFDIiTiNRKTKLRVLEDGKQQVQVVGIjQE 
RE VKCVED VLKL I D I GNSCRTSGQTS ANAHS S RSKAVFQ I ILRR 
KGKLHGKFS LIDIAGNERGADTS sadrqtrlbgae inksllalk 
EC I RALGRNKPHTPFRASKLTQ VLRDS FI GENSRTCM I ATI S PG 
MASCENTLNTLRYANRVKELTVDPTAAGDVRP 1MHHPPNQI \DD 
LETQWGVGSSPQRDDbKIjIjCEQNEEEVSPQLFTFHEAVSQMVEM 
EEQWEDHRAVFQES I RWLEDEKALLE MTE EVDYD VDS YATQLE 
AILEQKIDZLTELRDKVKSFRAALQEEEQASKQ1NPKRPRAL 


5976


20 


2949 


VHHLHbTRVSVWNLDIILRIAQQMGXKTLNLVLG\LKRA\LEF 
PE VSWME V KD PNMKGAMLTNTGKYAI PTIDA\EAYAIGKKEKPP 
FLPEEPSSSSEEDDPIPDELLCLICKDIMTDAWIPCCGNSYCD 
BCIRTALLESDEHTCPTCHQNDVSPDALIAilKFIiRQAVNNFKNE 
TGYTKRLRKQLPSPPPPIPPPRPLIQRNLQPLMRSPISRQQDPL 
MIPVTSSSTHPAPSISSLTSNQSSLAPPVSGNPSSAPAPVPDIT 
ATVS ISVHSEKSDGPFRDSDNKILPAAALASEHS KGTSS IAITA 
LMEEKG YQ VP VLGTPS LLGQSLLHGQIil PTTGP VRINTARPGGG 
R PG W EH SN KLG YL VS P PQQI RRGERS CYRS 1 NRGRHHS ERSQRT 
CGPSLPATPVFVPVPPPPLYPPPPHTLPLPPGVPPPQFSPQFPP 
GQP\PPAGYSV?PPGFP PAPANLSTPWVSSGVQTAHSNTI PTTQ 
APPLSREE FYREQRRLKEEEKKKS KLDEFTNDFAKELMEYKKIQ 
KERRRS FSRS KS P YSG S S YSRS S YT YS KSRSGS TRSRS YSRS FS 
RSHSRSYSRSPPYPRRGRGKSRNYRSRSRSHGYHRSRSRSPPYR 
RYHSRSRSPOAFRGOS PNKRNVPOGRTFRFYPNWVRrvdtjdvtim 
KAY YGR S VD FRDP FEKER YREWERKYR EWYEKY YKG YAAGAQPR 
PS ANRENFS P E R FT >PLNI RNS P FTRGRRED YVGGQ SHRSRN I GS 
NYPEKLSARIX3HNQKDNTKSKEKESENAPGDGKGNKHKKHRKRR 
KGEESEGFLNPELLETSRKSREPTGVEENKTDSLFVLPSRDDAT 
PVRDEPMDAES I T FKS VS E KDKRERDKP KAKGDKTKRKNDGSAV 
SKKENIVKPAKGPQEKVDG\DVRDLLDLNL\QLKKPKEETPKDL 
TILNHHLPLRRMKKSL \EP P\ EKLTIiNQQK\TPRNKTSQRGKSB 
EGLFQRCQIRKANN 


5977 


1363 


1336 


FLEDRGQVLSHFQCLSLHS INHILHPGAGVAAGPATGW/REYL'r 
PVLKESKFKETGVITPEEFVAAGDHLVHHCPTWQWATGEELKVK 
AYLPTGKQFLVTKNVPCYKRCKQMEYSDELEAI IEEDDGDGGWV 
DTYHNTGITGITEAVKEITLENKDNIRLQDCSALCEEEEDEDEG 
EAADMEEYEESGLLETDEATLDTRKIVEACKAKTDAGGBDAIIiQ 
TRTYDLYITYDKYYQTPRLWLFGYDEQRQPLTVEHMYEDISQDH 
VKKTVTIENHPHLPPPPMCSVHPCRHAEVMKKIIETVAEGGGEL 
GVHMYLLI FLKFVQAVI PTI E YDYTRHFTM 


5978 


160 


3213 


RDGARRWGGCQS PLTWAPGFYRRFDLATSGRRLRGQTAEPAGRQ " 
R PRRE PE AMDEQS VES IAE VFRCFI CME KLRDARLCPHC3KLCC 
FSCIRRWLTEQRAQCPHC?RAPLQLRELVNCRWABEVTQQLDTLQ 
LCSLTKHEENEKDKCENHHEKLSVFCWTCKKCICHQCALWGGMK 
GGHTFKPLAEI YBQHVTKVNEEVAKLRRRLME LI SLVQEVERNV 
EAVRNAKDERVRKIRNAVEMMIARLDTQLKNKLITLMGQKTSLT 
QETELLESLLQEVEHQLRSCSKSBLISKSSEILMMFQQVHRKPM 
ASFVTTP VP PDFTS E LVPS YDS ATFVIiENFSTLRQRAD P VYS PP 
I*Q VSGL CWRLKVYPDGNG WRG YYLS VFLELS AGL PETS K YE YR 
VEMVHQSCNDPTKNI IREFASDFEVGECWGYNRFFRLDLLANBG 
YLN PQNDTVI LR FQ VRS PTF FQKS RDQHWY I TQLEAAQTS Y IQQ 
INNIiKERLTI ELSRTQKSRDLS P PDNHLS PQNDDALETRAKKS A 
Predicted
corresponding
to first
amino acid
Predicted end
location
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








CSDMLLER \GP YSAS \VREAKEDEEDEEKIQNEDYHKEiSDGDL~| 

DLDLVYEDEVNQLDGSSSSASSTATSNTEENDIDEETMSGENDV 

EYKNMELEEGELMEDAAAAGPAGSSHGYVGSSSRISRRTHLCSA 

ATSSLLDIDPLILIHLLDLKDRSSIENLWGLQPRPPASLLQPTA 

SYSRKDKDQRKQQAMWRVPSDLKMLKRLKTQMAEVRCMKTDVKN 

TLSEIKSSSAASGDMQ7SLFSADQAAIAACGTENSGRLQDU5ME 

LLAKSSVANCYIRNSTNKKSNSPKPARSSVAGSLSLRRAVDPGE 

NSRSKGDCQTLSEGSPGSSQSGSRHSSPRALIHGSIGDILPKTE 

DR Q C KALD S D A WVA V FS G LPA VE KRRKMVT LG AN AKGGH LEGL 

QMTDLENNSETGELQPVLPEGASAAPEEGMSSDSDIECDTENEE 

QEEHTSVGGFHDSFMVMTQPPDEDTHSSFPDGEOIGPEDLSFNT 
EBNSGR 


5979


212


3665 


LPDm'MYbWLKLI^FGFAFLI^EVFVTGO^PTPSPTDAYI^A^EH 
TTTLSPSGSAVISTTTIATTPSKPTCDEKYANITVDYLYNKETK 
LFTAKLNVNENVECGNNTCTNNEVHNLTECKNASVS ISHNSCTA 
PDKTL I LD VP P G VEKV P VHCCS \Q VEQ PDS T I ML KWKN IETS TC 
DTQNITYRFQCGNMIFDNKEIKLENLEPEHBYKCDSEILYNSHK 
FTNASKIIKTDFGSPGEPQIIFCRSEAAHQGVITWNPPQRSFHN 
FTLC YI KETE KDCLNLD KNLI KYDLQNL KP YTKYVLS LHA Y 1 1 A 
KVQRNGSAAMCHFTTKSAPPSQVWNMTVSMTSDNSMHVKCRPPR 
DRNGPHERYHLEVEAGNTLVRNESHKNCDFRVKDLQYSTDYTFK 
AYFHNGDYPGEPFILHHSTSYNSKALIAFLAFLIIVTSIALLW 
LYKI YD LHKKRS CNLDEQQELVE RDDE KQLMNVE P I HAD I LLET 
YKRKI ADEGRLFLAEFQS I PRVFS KFP I KEARKP FNQNKNR YVD 
ILPYDYNRVEXjSEINGDAGSNYINASYIDGFKEPRKYIAAQGPR 
DE T VDDFWRMI WEQKATVI VM VTR CEEGNRNKCAE YWPSMEEGT 

rafgeccckdltkhkrcp\dyiiqklnivnkkekatgrevthiq 
ftswpdhgvpedphlllklrrrvnafsnffsgpiwhcsagvgr 
tgtyigidamlegleaenkvdvygywklrrqrclmvqveaqyi 

LIHQAIiVEYNQFGETEVNLS ELHP YLHNMKKRDPPSEPS PLEAE 
FQRLPSYRSWRTQHIGNQE\ENKSKNRNSNVIPYDYNRVPLKHE 
LEMSKESEHDSDESSDDDSDSEEPSKYINASFIMSYWKP\EVMI 
AAQG PLKETIGDF WQM I FQ R KVKVI VMLTELKHGDQE I CAQ YWG 
EGKQTYGDIE VDLKDTD KS S T YTLRVFE LRHSKRKDSRTVYQ V Q 

ytnwsveqlpaepkelismiqwkoklpqknssegnkhhkstpl 

L IHCRDGSQQTG I FCALLN LLE S AETE E WDI FQWKALRKARP 
GMVSTFEQYQFLYDVIASTYPAQNGQVKKNNHQEDKIEFDNEVD 
KVKQDANCVNPLGAPEKLPEAKEQAEG S EPTSGTEG PEHSVNG P 
ASPALNQGS 


5980 


3 


2363 


DAWGCKLRRLRFTYGTQTRVSLALPGQYELVHTLVAHOX^NWETI 
PEEDLEVQE^EDAAHDLTELEVTMHHALI^EVDVVVAPCQGLR 
PTVDVLGDLVNDFLPVITYALHKDELSERDEQELQE I RKYFS FP 
VFFFKVPKLGSEIIDSSTRRMESERSPLYRQLIDLGYLSSSHWN 
CGAPGQDTKAQSMLVEQSEKLRHJbSTFSHQVLQTRLVDAAKALN 
LVHCHCLD I F IKQA FDMQRDLQ I TP KRLE YTRKKENE L YESLMN 
IANRKQEEMKDMIVETLNTMKEELLDDATNMEFKDVIVPENGEP 
VGTRE I KCCIRQ IQELI I SRLNQAVANKLI SS VD YLRES FVGTL 
ERCLQ S LEKSQD VS VH I TSNYLKQ I LNAA YHVE VTFHSGSSVTR 
MLWEQIKQIIQRITWVSPPAITLEWKRKVAQEAIESLSASKLAK 
S I C S Q FRTRLNS SHEAFAASLRQLBAGHSG RLE KTBDLWLR VRK 
DHAPRLARLS LESRS LQDVLLHRKP KLGQ E LGRGQ YG WYLCDN 
WGGHFPCALXSVVPPDEKHWNDLALEFHYMRSLPKHERLVDLKG 
SVIDYNYGGGSSIAVLLIMERLHRDLYTGLKAGLTLETRLQIAL 

MMSGS IVGTPIHMAPELFTGKYDNS VDVYAFGILFWYICSGSVK 
LPEAFERCASKDHLVJNNVRRGARPERLPVFDEECWQLMEACWDG 
DPLKRPLLG I VQPMLQG IMNRLCKS \NSEQPNRGLDDST 


5981 


1 


2519 

< 


GRKHSAAMERPWGAAnGLSRWPHGI^IXtLLQLLPPSTL§QDRT| 
DAPP P PAAPL PRWSGP I G VS WGLRAAAA \GGAFPRGGRWRRS AP 
3\EDEECGRVRDFVAKLANNTHQHVFDDLRGSVSLSWVGDSTGV 
ILVLTTFHVPLVIMTFGQSKLYRSEDYGKNFKDITDLINNTFIR 
SEQ
ID
NO:


Predicted
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 

sequence


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








tefgmaigpeksgk^a^taevsggsrggriprssdfaxnfvqtdH 

LPPHPLTQMMYSPQNSDYLLALSTENGLWVSKNFGGKWEEIHKA 
VCLAKWGSDNTIPFTTYANGSCKADLGALELWRTSDJW3KSFKTI 
GVKIYSFGU3GRFLFASVMADKDTTRRIHVSTDQGDTWSMAQLP 
SVGQEQFYS ILAANDDMVFMHVDEPGDTGPGTI FTSDDRG IVYS 
KSLDRHLYTTTGGETDFTNVTSLRGVYITSVLSEDNSIQTMITF 
DQGGRWTHLRKPBNSECDATAKNKNECSLHIHASYSISQKLNVP 
MAPLSEPNAVGIVIAHGSVGDAISVMVPDVYISDDGGYSWTKML 
EGPHYYTILDSGGIIVAIEHSSRPlNVIKFSTDEGQCWQrYTFT 
RDPIYFTGLASEPGARSMNISIWGFTESFLTSQWVSYTIDFKDI 
LERNCEEKDYTIWLAHSTDPEDYEDGCILGYKEQFrjRLRKSSVC 
QNGRDYWTKQPS ICLCSLEDFLCDFGYYRPENDSKCVEQPELK 
GHDLEFCLYGREEHLTTNGYRKI PGDKCQGGVNPVREVKDLKKK 
CTSNFLSPEKQNSKSNSVPIILAIVGLMLVTWAGVLIVKKYVC 

GGRFLVHLYSVLQQH\AEA\NGVDGVDALDTASHTNK5GYHDDS 
DEDLLE 


5982 


j $6 


2316 


ATRPPRGSSWCRQFSRTASAAPGRSNMLRIPVRKALVGLSKSPkH 
GCVRTTATAASNL1 EVFVDGQS VMVEPGTTVLQACEKVGMQI PR 
FC YHE RLS VAGNCRMCLVE I EKAP KWAACAMP VM KG WNI LTNS 
EKSKKAREGVMEFLUANHPLDCPICDQGGECDLQDQSMMFGNDR 
S R FLEGKRAVEDKNIG PLVKT I MTRCIQCTRC I R FAS EI AG VDD 
LG TTGRGNDMQ VGTY I EKMFMS E LSGN 1 1 D I C P VGALTS KP YAF 
TARPWETRKTESIDVMDAVGSNIWSTRTGEVMRILPRMHEDIN 
EEWISDKTRFAYDGLKRQRLTEPMVRNEKGLLTYTSWEDAIiSRV 
AGMLQSFQGKDVAAIAGGLVDAEALVALKDIiLNRVDSDTLCTEB 
VFPTAGAGTDLRSNYLUgTTIAGVEEADWLLVGTNPRKEAPLF 
NARIRKS WLHNDLKVALIGSPVDLTYT YDHLGDS PKILQDIASG 
SHP FS Q VLKE AXKPM WLGSS ALQRND3 AA I LAAVS S IAQKIRM 
TSGVTGDWKV^ILHRIASQVAALDLGYKPGVEAIRKNPPKVLF 
LU5ADGGCITRQDLPKDCFIIYQGHHGDVGAPIADVILPGAAYT 
EKS AT YVNTEGRAQQTKVAVTPPGLAREDWKI I RALSE I AGMTL 
PYDTL\DQVRNRLEEVSPNLVRYDDIEG\ANYFQQANELSKLVN 
O^LLADPLVPpQLTMKDFYMTDSlSRASQTMAKCVKAVTEGAQA 


5983 
5984 


248 


1763 


EARGDGGRRRiQ^GRRAGRGEP\AGLKSQGQRAVPKRAVARGGH 
RQ\ YS AAI ALLE P AGS B IADDLS I L YSNRAACYLKEGNCS GCI Q 
DCNRAL3LHPFSMKPLLRRAMAYETLEQYGKAYVDYKTVLQIDC 
GLQLANDSVNRLSRILMELDGPNWREKLSLIPAVPASVPLQAWH 
PAKEMISKQAGDSSSHRQQGITDEKTFXALKEEGNQCVNDKNYK 
DALS KY S BCLKINNKECAI YTNRALCYLKLCQFEEAKQDCDQAL 
QLADGNVKAFYRRALAHKGLKNYQKSLIDLNKVILLDPSI IEAK 
MELEEVTRLLNLKDKTAPFNKEKERRKIEIQEVNEGKBEPGRPA 
GEVSTGCLASEKGGKSSRSPEDPEKLPIAKPNNAYEFGQIIiTAL 
STRKDKEACAHLLAITAPKDLPMFXSNKLEGDTFLLLIQSLKNN 
L IEKD PS L VYQHLL YLS KAER FKMMLTL I S KGQKELI EQLFEDL 
SDTPNNH FTLE D I QALKRQYEL | 


5985 


755 


1193 


SSVCMACTWSNMKXQRSVSFI^ 

TGDVGRRICRLLVGLFTKGDTSSKRVHPFSPGPCFLLCDLARVG 

SSPKINVSPFYQN\QTSTQRSCTVFVWQRCSLVGPFQVTVFTMY 
FHHSLRS 1 S RFSSG 




22 


1408 

1 


KKVARPGTAEPAKARRWRRGI^RDIAGAERKAGVSERGDSGXl 
RRPNPSIPSAAAGMSHIQIPPGLTELLQGYTVEVLRQQPPDLVE 
FAVEYFTRLREARAPAS VLPAATPRQSLGHPPPEPG PDRVAJDAK 
^UokoebDEDLEVPVPSRFNRRVSVCAETYNPDEEEEDTDPRVI 
HPKTDEQRCRKJEACKDI LL FKMLDQEQLS QVLDAM FER I VKAD 
EHVIDQGDDGDNFYVIERGTYDILVTKDNQTRSVGQYDNRGSFG 
ELALM YNTPRAAT I VATS EG S LWGLDR VT FRRI I VKNNAKKR KM 

FESFIESVPLLKSLEVSERMKIVDVIGBKIYKR/DGERIITQGE 
K\ADSFYI I SSGE VS I LIRS RTKSNKDGGNQE VE IARCHKGQ YF 
3 ELALVTNKPRAASAYAVGD VKCLVMDVQAFERLLGP CM D I MKR 
^ISHYEBQLVKMFGSSVDLGNLGQ j 
SEQ
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


5986 


1806 


484 


DAWKSTSLTPHWKLWGRHRGRRRGLAHPKNHLSPQQGGATPQVP 
S PCCRFDS P RGPP PPRLGLLGALMAEDG VRGS P P V PSG PPM EED 
GLRWTPKS PLDPDSGLLSCTL PNGFGGQ S G P EG ER S LAP PDAS I 
iji5«vt.&i^DMVAQEI.FQGSDLGMAEEAERPGEK\AGQHSPLRE 
EHVTCVOSILDEFLQT\YGSLIPLSrDEWEKLEDIFQQEFSTP 
SRKGLVLQL I QS YQRM PGNAMVRG FRVA YKRHVLTMDDLGTL YG 
QNWLNDQVMNMYGDLVMDTVPEK\VHFFNSFFY\DKLRTKGYDG 
VKRWTKNVDI FNKELLL I P IHLE VH WSL I S VDVRRRTI TYFDS Q 
RTLNRRCP KH I AKYLQAEAV KKDRLD FHQGWKG YFKMNVARQNN 

DSDCGAFVLQYCKHLALSQPFSFTQQDMPKLRRQIYKELCHCKL 
TV 


5987 


1806 


484 


DAWKSTSLTFHWKLWGRHRGRRRGLAHPKNHLSPQQGGATPQVP 
S P CCRFDS PRG P P PPRLGLLGALMAEDGVRGS P P VPSG P PME E D 
GLRWTPKS PLD PDSGLLS CTL PNG PGGQS G PEG ERS LAP PDAS I 
LI SNVCSIGDHVAQELFQGSDLGMAEEAERPGEK\ AGQHS PLRE 
EHVTCVQSILDEFLQT\YGSLIPLSTDEWEKLEDIFQQEFSTP 
SRKGLVLQLIQSYORMPGNAMVRGFRVAYKRHVLTMDDLGTLYG 
CNWLNDQVMNMYGDLVMDTVPEK\VHFFNSFFY\DKLRTKGYDG 
VKRWTKNVDI FN KELLL I P I HLE VH W S LI S VD VRRRTIT Y FDS Q 
RTLNRRCPKH I AKYLQAEAVKKDRLD FHQGW KG YFKMNVARQNN 
DSDCGAFVLQYCKHLALSQPFSFTQQDMPKLRRQIYKELCHCKL 

I V 


5988 


1292 


410 


FKKYFLSFLGLLESSHSRDRIHNLVLMFLLATHNLVWWFTCRFQ - 

RLDCIYLNAGIMPNPQLNIKALLFGLFS\AEGLLTQGDKITADG 

LQEVFETDVFGHFILIRELEPLLCHSDNPSQLIWTSSRNARKSN 

FSLEDFQHSKGKEPYSSSKYATDLLSVALNRNFNQQGLYSNVAC 

PGTALTNLTYGILPPFIWTLLMPAILLLRFFANAFTLTPYNGTE 

ALVWLFHQKPBSLNPLIKYLSATTGFGRNYIMTQKMDLDEDTAE 

KFYOKLLELEKHIRVTIQKTDNQARLSGSCL 


5989


194 


2610 


AMDFPQHSQHVLEQLNQQRQLGLLCDCTFVVDGVHFKAHKAVLA 
ACSE Y FKMLFVDQKDWHLD I SNAAGLGQ VLEFM YTAKLS LS PE 
NVDDVL\AVATFLQMQDI ITACHALKS LAEPATS pggnaealat 
EGGDKRAKEEKVATSTLSRLEQAGRSTPIGPSRDLKEERGGQAQ 
SAASGAEQTEKADAPREPPPVELKPDPTSGMAAAEAEAALSESS 
EQEMEVEPARKGEEEQKEQEEOEEEGAGPAEVKEEGSQLENGEA 
P EENENE ES AGTDS GQELG SEARG LRS GT YGDRTE S KAYGS V I H 
KCEDCG KEFTHTGNFKRH I R I HTGEKP FS CRECS KAFS DP AACK 
AHEKTHS PLKPYGCEECGKS YRLI SLLNLRKKRHSGEARYRCED 
CGKLFTTSGNLKRHQLVHSGEKPYQCDYCGRSFSDPTSKMRHLE 
THDTDKEHKCPHCDKKFNQVGNI.KAHLKIHIADGPLKCRECGKQ 
FTTSGNL KRHLR I HSGE KP YVC IHOQRQ FADPGALQRHVR I HTG 
EKPCQCVMCGKAFTQASSLIAHVRQHTGEKPYVCERCGKRFVQS 
SQLWraiRHHDNIRPHKCSVCSKAFVNVGDLSKHIIIHTGEKPY 
LCDKCGRG FNRVDNLRSH VKTVHQGKAG I K I LEPEEG SEVS WT 
VDDMVTIATEALAATAVTQLTVVPVGAAVTADETEVLKAEISKA 
VKQVQEEDPNTH I LYACDS CGDKFLDANS LAQHVR IHTAQALVM 
FQTDAD F YQQ YGPGGTW PAGQ VLQ AG ELV FR PRDGAEGQPALAE 
TSPTAPECPPPAE 


5990


2 


4700 


FGPGPDSGGGARGSGWGSRSQAPYGTLGAVSGGEQVLLHEEAGD - 

SGFVSLSRLGPSLRDKDLEMEELMLQDETLLGTMQSYMDASL1S 

LIEDFGSLGEVEMSLPDPSWDFSPPSFLETSSPKLPSWRPPRSR 

PRWGQSPPPQQRSDGEEEEEVASFSGQILAGELDNCVSS I PDFP 

MHLACPEEEDKATAAEMAVPAAGDES I SSLSELVRAMHPYCLPN 

LTHLASLEDELQEQPDDL'lTiPEGCWLEIVGQAATAGDDLEIPV 

WRQVSPGPR PVLLDDSLETSSALQLLMPTLESETEAAVPKVTL 

CSEKEGLSLNSEEKLDSACLLKPREWEPWPKEPQNPPANAAP 

GSQRARKGRKKKSKEQPAACVEGYARRLRSSSRGQSTVGTEVTS 

QVDNLQKQPQEELQKESGPLQGKGKPRAWARAWAAALENSSPKN 

LERSAGQSSPAKEGPLDLYPKLADTIQTNPIPTHLSLVDSAQAS 

PMPVDSVEADPTAVGPVLAGPVPVDPGLVDLASTSSELVEPLPA 

EPVLINPVLADSAAVDPAWPISDNLPPVDAVPSGPAPVDLALV 
SEQ
ID 
NO: 


Predicted 
beginning 
nucleotide
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








DP VPNDLTP VDP VLVKS R PTDPRRGAVS S ALGGS APQLLVES ES 
LDPPXTIIPEVKEVVDSLKIESGTSATTHEARPRPLSLSEYRRR 
RQQRQAETE E RS PQ P PTGKW PSL P ET PTGLAD I PCLVI P PAPAK 
KTALQRSPETPLEICLVPVGPSPASPSPEPPVSKPVASSPTEQV 
PSQEMPLLARPSPPVQSVSPAVPTPPSMSAALPFPAGGLGMPPS 
LPPPPLQPPSLPLSMGPVLPDPFTHYAPLPSWPCYPHVSPSGYP 
CLPPPPTVPLVSGTPGAYAVPPTCSVPWAPPPAPVSPYSSTCTY 
GPLGWGPGPQHAPFWSTVPPPPLPPASIGRAVPQPKMESRGTPA 
GPPENVLPLS MAPPLSLGLPGHGAPQTEPTKVE VKP VPAS PHPK 
KKVSALVQSPQMKALACVSAEGVTVEEPASERLKPETQETRPRB 
KPPLPATKAVPTPRQSTVPKLPAVHPARLRKLSPLPTPRTQGSE 
DWQAFISEIGIEASDLSSLLEQFEKSEAKKECPPPAPADSLAV 
GNSGGVDIPQEKRPLDRLQAPELANVAGLTPPATPPHQLWKPLA 
AVSLLAKAKSPKSTAQEGTLKPEGVT3AKHPAAVRLQEGVHGPS 
RVHVG SGDHD YC \ VRSRT P P KK\ MPAJjLI P EVGS RWNVKRHQD I 
TI KPVLSLGPAAPPPPCIAASREPLDHRTS SEOADPSAP CLAPS 
SLLSPEASPCRNDMNTRTPPEPSAKQRSMRCYRXACRSASPSSQ 
GWQGR3GRNSRSVSSGSNRTSEASSSSSSSSSSSRSRSRSLSPP 
HKRWRRSSCSSSGRSRRCSSSSSSSSSSSSSSSSSSSSRSRSRS 
PSPRRRSDRRRRYSSYRSHDHYQRQRVLQKERAIEERRWFIGK 
IPGRMTRSELKQRFSVFGEIEECTIHFRVQGDNYGFVTYRYAEE 
AFAAI ESGHKLRQADEQP FDLCFGGRRQFC KRS YSDLDS NRED F 
DPAPVXSKFDSLDFDTLLKQAQKNLRR 


5991 


I 11 A 


1379 


RLSSHFSQCSPSIYC\TKFDKQGNVTSFERKKTELYQELGLQAR 
DLRFQHVMS I TVRNNR 1 1 MRME YLKAVITP ECLL I LDY RNLNLK 
QWLFR 3L PSQLSG EGQLVTYPLP F EFRAIEALLQ YW I NTLQGKL 
SILQPLILETLDALGDPKHSSVDRSKLHILLQNGKSLSELETDI 
. KIFKESILEILDEEELLEELCVSKWSDPQVFEKSSAGIDHAEEM 
ELLLENYYRLADDLSNAARELRVL IDDSQS 1 1 FINLDSHRNVMM 
RLNLQLTMGTFSLSL FGLMGVAFGMNLESS LE EDHRI FWL I TG I 
MPMGSGLIWRRLLSFLGR/LARSSIASYGMKDMVHGGIVEGL 


5992 


2 


609 


AGPDFRLVCGVSGSGFPGGRQGQATEWRPLRPWNGAMEKLRRVL 
SGQDDEEQGLTAQDSQINL/SEVLDASSLSFNTRLKWFAICFVC 
GVFFS I LGTGLLWLPGG I KLFAVF YTLGNLAALAS TCFLMGP VK 
QLKKMFBATRLIATIVMLLCFIFTLGAALMWHKKGLAVLFCILQ 
FLSMTWYSLSYIPYARDAVIKCCSSLLS 


5993


1650 


594 


AEGLGS WAVWAGIjG WAGRHMEAGGATGALGVG C KL PSAFC FPGS 
SVAMDMFQKVEKIGEGTYGVVYKAKNRETGQLVALKKIRLDLEM 
EGVPSTAIREISLLKELKHPNIVRLLDWHNBRKLYLVFEFLSQ 
DLKKYWDSTPGSELPLHL I KS YLFQLLQGVSFCHSHRVIHRDLK 
PQNLLINEUSAIKLADFGLARAFGVPLRrYTHEVVTIiWYRAPEI 
LLATRFYTTAVDIWSIGCIFAEMVTRKALFPGDS\EXDQ\LFRI 
FRMLGT PS EDTWPGVTQ LPD YKGS FPKWTRKGLEE I VPNLE PEG 

RDLLMQLLQYDPSQRITAKTALAHPYFSSPEPSPAARQYVLQRF 
RH 


5994 


394 


1934 


AGEVQLHVWIRGMRIQPQ/KAAAIIDLDPDFEPQSRPRSCTWP"E" 
PRPEIANQPSKPPEVEPDLGEKVHTEGRSEPILLPSRLPEPAGG 
PQPGILGAVTGPRKGGSRRNAWGNQSYAELISQAIESAPEKRLT 
LAQIYEWMVRTVPYFKDKGDSNSSAGWKNSIRHNLSLHSKFIKV 
HNEATGKSSWWMLNPEGGKSGKAPRRRAASMDSSSKLLRGRSKA 
PKKKPSGLPAPPEGATPTSPVGHFAKWSGSPCSRNREEADMWTr 
FRPRS SSNASS VSTRLS PLRPESEVLAESI PAS VSS YAGGVPPT 
LNEGLELLDGLNLTSSHSLLSRSGLSGFSLQHPGVTGPLHTYSS 
SLPS PAEG PLS AGEGCFSSS QALEALLTS DTP P PP AD VLM TQVD 
P ILS QA PTLLLLGGLPSS SKLATG VGLC P KPLEAPG PSSL VPTL 
SM I AP P P VMAS AP I PKALGTP VLTP PTE AASQDRMPQDLD LDMY 
MENLECDMDNI ISDLMDEGEGLDFNFEPDP 


5995


2 


2437 


RPPGPGPASGAWLCTRARGSAAFVPPLPRPPSRGARRRRRLPGR 
GVAALRRGPGSA PGL PRGRAERS AAGSG RGPSREERGAAAAAAA 
AEMMEELHSL\DP\RRQELLEARF\TGLGVSKGPLNSESSNQSL 
CSVGSLSDKEVETPEKKQNDQRNRKRKAEPYETSQGKGTPRGHK 
SEQ
ID 

NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








ISDYFERRVEQPLYGLDGSAAKEATEEQSALPTLMSVMLAKPRL 
DTEQLAQRGAGLCFTFVSAQQNSPSSTGSGNTEHSCSSQKQISI 
QHRQT \QS DLT I EKI SALENS KNS DLE KKEGR IDDLLRANCDLR 
RQI\DEQQKMLEKYK\ERLNRCFDNEPRNFLIEKSKQEKMACRD 
KSMQDRLRLGHFTTVRHG AS FTEQ WTDG YAFQNL I KQQE R I NS Q 
REEIERQRKMLAKRKPPAMGQAPPATNEQKQRKSKTNGAENETL 
TLAEYHEQEEIFKLRLGHLKKEEAEIQAELSRLERVRNLHIREL 
KRIHNEDNSQFKOHPTLNDRYLLIiHLLGRGGFSEVYKAFDLTEQ 
R Y VAVK I HQLNKNWRDEKKEN YHKHACRE YR I HKEIiDHPR I VKL 
YDYFSLDTDSFCTVLEYCEGNDLDFYLKQHKLMSEKEARSIIMQ 
IVNALKYLNEIKPPIIHYDLKPGNILLVNGTACGEIKITBFGLS 
KIMDDDSYNSVDGMELTSQGAGTYWYLPPECFWGKEPPXISNK 
Vtf V WS VG VI FYQCL YGRKPFGHNQSQQDI LQENTI LKATEVQFP 
PKPWTPEAKAFIRRCLAYRKBDRIDVQQLACDPYLLPHIRKSV 
STSS PAGAAI AS TSGAS NNS S S N 


5996 


1612


981 


DQQACLLGLMLTLEFGILEFDPSWIGSWTQR/SWVSWRSRPGCE
LFSIVVFGSIVNEGYLNSASEGEEFCIYNRNPNACSYGVAVGVL
AFLTCLLYLALDVYFPQISSVKDRKK\AVLSGHPWSGEPHPAA
FWAFLWFTGDSCYL\ANQWQVSKPKDNPLNEGTDASPGRPSPFS
FFSIFTWSLTAALAVRRFKDLSFQEEYSTLFP\ASAQP


5997 


1612 


981 


DQQACLLGLMLTLEFGILEFDPSWIGSWTQR/SWVSWRSRPGCE
LFSIVVFGSIVNEGYLNSASEGEEFCIYNRNPNACSYGVAVGVL
AFLTCLLYLALDVYFPQISSVKDRKK\AVLSGHPWSGEPHPAA
FWAFLWFTGDSCYL\ANQWQVSKPKDNPLNEGTDASPGRPSPFS
FFSIFTWSLTAALAVRRFKDLSFQEEYSTLFP\ASAQP


5998 


1612


981 


DQQACLLGLMLTLEFGILEFDPSWIGSWTQR/SWVSWRSRPGCE
LFSIVVFGSIVNEGYLNSASEGEEFCIYNRNPNACSYGVAVGVL
AFLTCLLYLALDVYFPQISSVKDRKK\AVLSGHPWSGEPHPAA
FWAFLWFTGDSCYL\ANQWQVSKPKDNPLNEGTDASPGRPSPFS
FFSIFTWSLTAALAVRRFKDLSFQEEYSTLFP\ASAQP


5999 


2 


1790 


RPPMEKARRGGDGVPRGPVLHIVWGFHHKKGCQVEFSYPPLIP 
GDGHDSHTLPEEWKYLPFLALPDGAHNYQEDTVFFHLPPRNGNG 
ATVFG I SC YR \ Q I EAKALKVRQAD I TRETVQKS VCVLS KL PLYG 
LLQAKLQL I THA Y FEE KDFS QI S I LKEL YEHMNSSLGGAS LEGS 
QVYLGLSPRDLVLHFRHKGLILFKLILLEKKVLFYISPVNKLVG 
ALMT VLSL F PGM I EHGLSDCS QYR PR KS MS EDGGLQESNPCADD 
FVSASTADVSHTNLGTIRKVMAGNHGEDAAMKTEEPLFQVEDSS 
KGQEPNDTNQYLKPPSRPSPDSSESDWETLDPSVLEDPNLKERE 
QLGSDQTNL F P KDS VPS ES LP I TVQPQANTGQ WL I PGL3 S GLE 
EDQYGM PLA I F?KG YLCLP YMALQQHHLLSDVTVRG FVAGATN I 
LFRQQKHLSDAIVEVEEALIQIHDPELRKLLNPTTADLRFADYL 
VRHVTENHDUVFLDGTGWEGGDEW IRAQFAVY I HALLAATLQLV 
LFR I VNVAKKIGNVMVTT\ SRNWQTGK\AVGQS VGGAFS \ SAK 
TA\MSSWLSTFTTSTSQSLTEPPDEKP 


6000 


101 


1561 


TEPCRTAENCTATMSENNKNSLESSLRQLKCHFTWNLMEGENSL 
DD FEDKVF YRTEFQNREFKATM CNLLAYLKHLKGQNEAALE CLR 
KAEE L I QQEHADQAS I RSLVTWGNYAW VY YHMG RLS DVQ I YVD K 
VKHVCEKFSSPYRIESPELDCEEGWTRLKCGGNQNERAKVCFEK 
ALEKKPKNPEFTSGLAIASYRLDNWPPSQNAIDPLRQAIRLNPD 
NQ YLKVLLALKLHKMREEGEEEGEGEK \ LVEEALEKAPG \ VTD V 
LRSAA\KFYRGKDEPDKAIELLKKALEYIP\NNAYLHCQIGCCY 
RAKVFQVMNLRENGMYGKRKLLELIGHAVAHLKKAJDEANDNLFR 
VCS I LASLHALADQYEDAEY YFQKEFS KELTPVAKQLLHLRYGN 
FQLYQMKCEDKAIHHFIEGVKINQKSREKEKMKDKLQKIAKMRL 

SKNGADSEALHVLAFLQELNEKMQQADEDSERGLESGSLIPSAS 
SWNGB 


6001 


17$ 


1038 


AFAHSPSRGHRKTHIHTPRHTPRCTMAESHLQSSLITASQFFEI 
WLHFDADGSG YLEGKELQNL I QE LQQARKKAGLELS PEMKTFVD 
QYGQRODGKIGIVELAHVLPTEENFLLLFRCQQLKSCENEFMKT 
WRKYDTDHSGFIETEELKNFLKDLLEKANKTVDDTKLAEYTDLM 
SEQ ID NO: 

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence 

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence 

Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








LKLFDSNKDGKLELTEMARLLPVQENFLLKPQGIKMCGKEFNKA 
PELYDQDGNGYIDENBLDALLKDLCEKNKQDLDINNITTYKKNI 
MALSDGGKL YRTDLAL I LCAGDN 


6002 


977 


! 81 


IiAPPGGGLHIPPRTPLSHSRPPPSHHAPHPSPLPLPPADLrfPHS - 
S MAQRSDL L ELDCQLTRDR WWS HDE NLCRQSGLNRDVGSLD F 
EDLPLYKEKLEVYFSPGHFAHGSDRRMVRLEDLFQRFPRTPMSV 
EIKGKNEELIREQ/VLVRRYDRNEITIWASEKSSVMKKCKAANP 
EMPLSFTI SRGFWVLLS Y YLGLLPFI P I PEK? FFCFLPN I INRT 
YF P?S CS CLNQ LLA WS KWL I MRKSL IRHLEE RGVQVVFWCLNE 
ES DFEAAFS VGATG V I TD YPTALRHYLDNHGPAARTS 


6003 
6004 


140 


4098 


GKLRAFRGMRRLICKRICDYKSFDDEESVDGNRPSSAASAFKVP 

APKTSGNP ANSARKPGSAGG PKVGAGAS KEGGAGAVDEDDFI KA 

FTOVPSIQIYSSRELEETLNKIREILSDDiCHDWDQRANAIiKKIR 

SLLVAGAAQYDCFFQHLRLLDGALKLSAKDLRSQWREACITVA 

KLSTVLGNKFDHGAEAIVPTLFNLVPNSAKVMATSGCAAIRFII 

RHTH VPRL I P L I TSNC ?S KS VP VRRRS FE FLDLLLQEWQTHSL E 

RHAAVLVETIKKGIHDADAEARVEARKTYMGLRNHFPGEAETLY 

NSLEPSYQKSLQTYLKSSGSVASLPQSDRSSSSSQESLNRPFSS 

KWSTANP STVAGRVSAGSS KAS SLPGSLQRSRS DIDVNAAAGAK 

AHHAAGGSVRSGRLGAGAIiNAGSYASLEDTSDKLDGTASEDGRV 

RAK LS AP LAGMGMAKADS RGRSRTKMVSQSQ PGSRSGS PGR VLT 

TTALS TVSS G VQR VL VNS AS AQKRS K I PRSQGCS REAS PS RLS V 

ARSSR I PRPS VSQGCS REASRESS RDTSP VRS FQPLASRHHSRS 

TGALYAPEVYGASGPG YGI SQSSRLS SSVSAMRVLNTGSDVEEA 

VADALLLGD IRTKKKPARR RYESYGMHSDDDANSDASSACS ERS 

YSSRNGSIPTYMRQT\EDV\AEVLNRCASSNWSERKEGLLGLQN 

LLKNQRTLSRVELKRLCEIFTRMFADPHGKRVFSMFLETLVDFI 

QVHKDDLQDWLFVLLTQLLKKMGADLLGSVQAKVQKALDVTRES 

FPNDLQFNILMRFTVDQTQTPSLKVKVAILKYIETLAKQMDPGD 

FINSSETRLAVSRVITWTTEPKSSDVRKAAQSVLISLFELNTPE 

FTMLLGALPKT FQDGATKLLHNHLRNTGNGTQS SMGSP LTR PTP 

RSPANWSSPLTSPTNTSQNTLSPSAFDYDTENMNSEDIYSSLRG 

VTEAIQNFSFRSQEDMNEPLKRDSKKDDGDSMCGGPG\MSDPRA 

GGDATDSSQTAL\ DNKASLLHS MPTHS S P RSRD YNPYN YS D S I S 

PFNKSALKEAKFDDDADQFPDDLSLDHSDLVABLLKEr,SNHNER 

VEERKIALYELMKLTQEESFSVWDEHFKTILLLLLETLGDKEPT 

IRALALKVLRE ILRHQPARFKN YAELTVMKTLEAHKDPHKE VVR 

SAEEAASV\LATS I\SPEQCIKVLCPI IQTADYPINLAAIKMQT 

KVIERVSKETLNLLLPEIMPGLIQGYDNSESSVRKACVFCLVAV 

HAVIGDELKPHLSQLTGS KM KLLNLYI KRAQTGSGGADPTTDVS 

GQS 




140 


4098 



GKLRAFRGMRRIi I CKR I CD YKS FDDEES VDGNR PS S AAS A FKVP 
APKTS GNPANSARKPGS AGGPK VGAGAS KEGGAGAVDEDDFIKA 
FTDVPS I Q I YSSRELEETLNKIRE ILS DDKHDWDQRANALKKIR 
SLLVAGAAQ YDCFFQHLRLLDGALKLS AKDLRSQWREACI TVA 
HLSTVLGNKFDHGAEAIVPTLFNLVPNSAKVMATSGCAAIRFII 
RHTHV PRL I PLITSNCTS KS VP VRRRS FEFLDLLLQEWQTHS LE 
RHAAVLVET I KKG I HDADAE AR VEAR KTYMGLRNH FPGEAETL Y 
NSLEPSYQKSLQTYLKSSGSVASLPQSDRSSSSSQESLNRPFSS 
KWSTANPSTVAGRVSAGSSKASSIiPGSLQRSRSDI DVWAAAGAK 
AHHAAGQS VRS GR LGAGALNAGS YASLEDTSDKLDGTAS EDGRV 
RAKLSAPLAGMGNAKADSRGRSRTKMVSQSQPGSRSGSPGRVLT 
TTALSTVSSGVQRVLVNSASAQKRSKIPRSQGCSREASPSRIiSV 
ARSSRIPRPSVSQGCSREASRESSRDTSPVRSFQPLASRHHSRS 
TGALYAPE VYGASG PG YGISQS S RLS SS VSAMRVLNTGS DVE EA 
VADALLLGD I RTKKKPARRR YES YGMHS DDDANSDAS SACSERS 
YSSRNGSIPTYMRQT\EDV\AEVLNRCASSNWSERKBGLLGLQN 
bLKNQRTLSRVELKRLCE I FTRMFADPHGKRVFSMFLETLVDFI 
QVHKDDLQDWLFVLLTQLLKKMGADLLGSVQAKVQKALDVTRES 
FPNDLQFNI LMR FTVDQTQTP S LKVKVA I LKYI ETLAKQMDPGD 
PINSSETRLAVSRVITWTTEPKSSDVRKAAQSVLISLFELNTPE 
FTMLLGALP KTFQIXJATKLLHNriLRNYGNGTQSSMGS PLTRPTP 
RSPANWSSPLTSPTNTSQNTLSPSAFDYDTENMNSEDIYSSLRG 
VTEAIQNFSFRSQEDMNEPLKRDSKKDDGDSMCGGPG\MSDPRA 
GGDATDSSQTAL\DNKASLLHSMPTHSSPRSRDYNPYNYSDSIS 

PFNKSALKEAMFDDDADQFPDDLSLDHSDLVAELLKELSNHKER 
VEERKIALYELMKLTQEESFSVWDEHFKTILLLLLETLGDKEPT 

IRALALKVLREILRHQPARFKNYAELTVMKTLEAHKDPHKEVVR 
SAEEAASV\LATSI\SPEQCIKVLCPIIQTADYPINLAAIKMQT 
KVIERVSKETLNLLLPEIMPGLIQGYDNSESSVRKACVFCLVAV 
HAVIGDELKPHLSQLTGSKMKLLNLYIKRAQTGSGGADPTTDVS 
GQS 


6005 


133 


5955 


RSSGRRQEQLGQFPGRERKGMASGLGSPSPCSAGSEEEDMDAIiL 

NNSLPPPHPENEEDPEEDLSETETPKLKKKKKPKKPRDPKIPKS 

XRQKKBRM LLCRQLGDS SGEG PE F VEEEEE VALRSDS EGS DYTP 

GKKKKKKLGPKKBKKSXSKRKEEEEEDDDDDDDSKEPKSSAQLL 

BDWGMEDIDHVFSEEDYRTLTNYKAFSQFVRPLIAAKNPKIAVS 

KMMMVLGAKWRS FSTNNP FKGS SGAS VAAAAAAAVAWES MVTA 

TE VAP P PP P VE VP IR KAKTKEGKGPNARRKPKGS PRVPDAKKP K 

P KKVA PLK I KLGGFGS KR KRS 3 S EDDDLD VESDFDBAS INS YS V 

SDGSTSRSSRSRKKLRTTKKKKKGEEEVTAVDGYETDHQDYCEV 

CQQGGEIILCDTCPRAYHMVCLDPDMEKAPEGKWSCPHCEKEGI 

QWEAKEDNSEGEEILBEVGGDLEEEDDHHMEFCRVCKDGGELLC 

CDTCPSSYHIHCLNPPLPEIPKGEWLCPRCTCPALKGKVQKILI 

WKWGQPPSPTPVPRPPDADPNTPSPKPLEGRPERQFFVKWQGMS 

YWHCSWVSELQLELHC\QVMFRNYQRKNDMDEPPSGDFGGDEBK 

S\RKRKNKDPKFAEMEERFYRYGIKPEW\MMIHRILNHSVDKKG 

HVHYLIKWRDLPYDQASWESEDVEIQDYDI.FKQSYWNHRELMRG 

EEG RPG KKL KKVKLRKLERPPETP TVDPTVKYERQPE YLDATGG 

TLHPYQMEGLNWLRFSWAQGTDTIIiADEMGLGKTVCiTAVFLYSIi 

YKEGHS KGP FLVSAPLSTI IN\ WEREFEMWAPDM YV\ VT YVGDK 

DSRAIIREXEFS\FEDNAIRGGKKASRMKKEASVKFHVLLTSYE 

LITIDMAILGSIDWACL1VDEAHRLKNNQSKFFRVLNGYSLQHK 

LLLTGTPLQNNLEELFHLLNFLTPERFHNLEGFLEEFADIAKED 

QIKKLHDMLG\PHMLRRLKADVFKNMPSKTEI>IV\RVELSPM\Q 

KKYYK\YILHSKFLKALN\ARGGGNQVSLIiNVVMDLKKCCNHPY 

LFPVAAMEAPKMPNGMYDGSALIRASGKLLLLQKMLKNLKEGGH 

RVLIFSQMTKMLDLLEDFLEHEGYKYERIDGGITGNMRQEAIDR 

FNAPGAQQFCFLLSTRAGGLGINLATADTVIIYDSDWNPHNDIQ 

AFSRAHR IGQNKiCVMI YR FVTRAS VEER I TQVAKKKMMLTHLW 

RPGLGS KTGSMS KQELDD I L KFGTBELFKDEATDGGGDNKEGED 

S SV I H YDDKAI ER LLDRNQDETEDTELQGMNfi YLSS FKVAQ YW 

REEEMGEEEEVERE 1 1 KQEES VDPDYWEKLLRHHYEQQQEDLAR 

Nl^KGKRIRKQVNYNDGSQEDRDWQDDQSDNQSDYSVASEEGDE 

DFDERSEAPRRPSRKGLRNDKDKPLPPLLARVGGNIEVLGFNAR 

QRKAFLNAIMRYGMPPQDAFTTQWLVRDLRGKSEKEFKAYVSI.F 

MRHLCE PGADGAE T FADG VPREGLS RQH VLTR IG VMS LIRKKVQ 

EFEHVNGRWSMPELAEVEENKKMSQPGSPSPKTPTPSTPGDTQP 

NTPAPVPPAEDGIKIEENSLKEEESIEGEKEVKSTAPETAIECT 

QAPAPASEDEKVWEPPEGEEKVEKAEVKERTEEPMETEPKGKG 

AADVEKVEEKSAIDLTPIWEDKEEKKEEEEKKEVMljQNGETPK 

DLNDE KQKKN I KQR FMFNIADGGFTBLHS LWQNE ERAATVTKKT 

YEI WHRRHDYWLLAGI INHG YARWQD IQNDPRYAI LNEPFKGEM 

NRGNFLEIKNKFLARRFKLLEQALVIEEQLRRAAYLNMSEDPSH 

PSMALNTRFAEVECi»AESHQHLSKESMAGNKPANAVLHKVLKQL 

EELLSDMKADVTRLPATIARIPPVAVRLQMSERNILSRLANRAP 

EPTPQQVAQQQ 




1 


965 


DNDFLRNTVHRHE P P VTAE PI RLLAENEDWWDKPSS I PVHPC 
GRFRHNTVIFILGKEHQLKELHPLHRLDRLTSGVLMFAKTAAVS 
ERIHEQVRDRQLEKE YVCR VEGEFPTEEVTCKEP 1 LWS YKVGV 
CRVDPRGKPCETVFQRLSYNGQSSWRCRPLTGRTHQIRVHLQF 
LGHPI LMDP I YNS VAWGPSRGRGGYIPKTNEELLRDLVAEHQAK 
QS LD VLDLCEGDLS PGLTDSTAPSS ELGKDDLEELAAAA \ QKM E 
E VAE AAPQELDTI ALAS EKA VETDVMNQ \ RQT\ TLCRVPAGATG 
SLAPRPCDVPTCPTL 


6007 


3 


2351 


HELGQVEYVFTDKTGTLTENEMQFRECSINGMKYQEINGRLVPE ~ 
G P TPDS SEGNLS YLS S LSHLNNLSHLTTS S S FRTS PBNETELI K 
EHDLPFKAVSLCHTVQINNVQTDCTGDGPWQSNIiAPSQLEYYAS 
SPDEKALVEAAARIGIVFIGNSEETMEVKTLGKLERYKLLHILE 
FDSDRRRMSVIVQAPSGEKLLFAKGABSSILPKCIGGEIEKTRI 

HVDEFALKGLRTLCIAYRKFTSKEYEEIDKRIFEARTALQQR\E 
EKLAAVFQPIEKDLILLGATAVEDRLQDKVRETIEALRMAGIKV 

WVLTGDKHETAVSVSLSCGHFHRTMNILELINQKSDSECAEQLR 
QLARRITEDHVIQHGLWDGTSLSLALREKEKLFMEVCRNCSAV 
trt-fu^xv/iivv J-ivLil ls.xy r'ilK.r'X I LtAVGlX5ANDVSMIQEAHV 

GIGIMGKEGRQAARNSDYAIARFKFLSKLLFVHGHFYYIRIATL 
VQYFFYKNVCFITPQFLYQFYCLFSQQTLYDSVYLTLY\NICFT 
SLPIHIYSLLEQHVDPHVLQNKPTKYRDISKNRLLSIKTFLYWT 
ILGFSHAFIFFFGSYLLIGKDTSLLGNGQMFGNWTFGTLVFTVM 
VITVTVKMALETHFWTWINHLVTWGSIIFYFVFSLFYGGILWPF 

LGSQNMYFVFIQIiLSSGSAWFAIILMVVTCIiFLDIIKKVFDRHIi 
HPTSTEKAQLTETNAGIKCLDSMCCFPEGEAACASVGRMLERVI 
GRCSPTHISRS WSASDPFYTNDRS ILTLSTMDSSTC 


6008 


4554 


10 Q9 


AGVRRAGARRG PGRALPAGATAVP P PSARRRRRCPAPEHAG PAR 
ASRPSQETMFQLPVNNLGSLRKARKTVKKILSDIGLEYCKEHIE 
DFKQ FE PNDFYLKNTTWEDVGLWDPS LTKNQDYRTKPFCCSACP 
FSS KFFS A YKSH FRNVHS EDFENR I LLNCPYCTFNADKKTLE TH 
IKIFHAPNASAPSSSIiSTFKDKNKNDGLKPKQADSVEQAVYYCK 
KCT YRDPL YE I VRKHI YREHFQHVAAP YI AKAGE KS LNGAVP LG 
SNAREESSIHCKRCLFMPKSYEALVQHVIEDHERIGYQVTAMIG 

HTNVWPRSKPLMLIAPKPQDKKSMGLPPRIGSLASGNVXRSLP 
SQQMVNRLSIPKPNLNSTGVNMMSSVHLQQNNYGVKSVGQGYSV 

GQSMRLGLGGNAPVS IPQQSQSVKQLLPSGNGRSYGLGSEQRSQ 
APARYSLQSANASSLSSGQLKSPSLSQSQASRVLGOSSSKPAAA 
ATGPPPGNTSSTQKWKICTICNELFPENVYSVHFEKEHKABiCVP 
AVAliyiMKII^FTS KCL YCNRYLPTDTLLNHMLIHGLSCPYCRS 
TFNDVEKMAAHMRMVHIDEEMGPKTDSTLSFDLTLQQGSHTNIH 
LLVTT YNLRDAPAES VAYHAQNNPPVPP 1CPQPKVQEKADI p VKS 
SPQAAVP YKXDVGKTLCPLCFS I LKGP ISDALAHHLRERHQVIQ 
TVHP VEKKLT YKC I HCLGVYTSNMTAS T I TLHLVHCRG VGKTQN 

GQDKIWAPSRLNQSPSLAPVKRTYEQMEFPLLjKKRKLDDDSDSP 
SFFEEKPEEPWLALDPKfJH\ FnnQVPapvQPT titvcpV vnnvo 

TRREIEKIJVASLWV\WK\SDIASHFSNKRKKCVRDCEKYKPGVL 
LGFNMKELNKVKHEMDFDAEGLFENKDEKDSSVNASKTADKKbN 
LGKEDDSSSDSFENLEEESNESGSPFDPVFEVEPKISNDNPEEH 
VLKVIPEDA5ESEEKLDQKEDGSKYETIHLTSEPTKLMHNASDS 
EVDQDDWEWKDGASPSESGPGSQQVSDFEDNTCEMKPGTWSDB 
SS QSEDARS S KPAAKKKATMQGDREQLKWKNSS YGKVEG FWS KD 
QSQWXNAS ENDERLSNPQI EWQNSTIDS EOGEQ FDNMTDG VAE P 
MHGS LAGVKL S SQQA 


6009 


4272 


1534 


CHGLQHLTPFRELNLSLQG*EPH*AA*QAVRSEEKSIC*GSPSC ' 
HLVLGVLVPVARQSSHSAGPAQSAFR*TGTGSGTPKAAEQSGYW 
EAYTLGHQHWNMFPIQRPPLVMKGRRIMCGKCEKG*VSDSVTGG 
RAVAGEQ AS QRRT VFTAGGGECLGAKS VRAS VFTGNQ PGVMG LL 
NGKRGGCFESGYLFGFIVIGKIQSLEAKVPLPVNGQTGERASPG 
NCRIHIVDAVC*SEJHH*DHFX*AAAFIiENSTIIS*VAPGSWQDHA 
VLQKEVQASVRCRGFESVDTAPAGFWAHSPPGLQGEPTTTSVSL 
FVIAPQIX5EGVPFVEGQLVTVLGLVVPQS IRHTFVHHTQLFLHP 
I * KLGALD VAFLHLLTLVCS S FNVAYG *GKNGGTTLHQL FAEVN 
AVTRGSAVQRRPS ITISS IHVDTKI QQELHDVMVAGADGWQWG 
DPFWGIAGI FHLI DDPUHQIELSFQRRV* EQCQGVKPDSQPVP 
RPLRVGLLQVGPLVRGGGRRVAGRGKRCWRDLLFPWRWGLSHRT 
RDXjLRGGDRGHWVI VLCRLGSLVGGLGTDELLWFGGR* LI TIG 
I * * RGRLSGEWGCGLGRGELFQVS IG IGVS I VHIGQGDHBVLGG " 
AGL VERGALHATGQGVEAL VQQ LLD VGPAGALGLCDGAALFQG P 
GRVGQLPAEGLQ VCI TLVAQWRMHDGRELGGAE W PWQALHGAA I 
CG VGGAILLKALSQ YFLKGG * RLWCARGQ * P VKKRQRRWRG* TR 
R *NGLTIHCFN* L I *GAVCCRL VI LRWQGLLEVHG VYGT * IHCL 
GS FPGRLWP* PPI SQERPNGHCQWE FRLAVPS WKCRWSRWRVRG 
TWRYGNPLLNLL*GAWLGGAACGGQQGGPLSTWQACTGPGQAAF 
L P P FQG ACR P RTQ RCR TWVC P I AW RQ LLA YTRD 


6010 


1 


3533 


IMPCGSSRLLRGCWTHPNEPVSDLSYFDCIESVMENSKVLGESM " 
AG I S QNAKTGDL PAFGECVGI AS KALCGLTEAAAQAAYLVGI FD 
PNSQAGHQGLVDPIQFARANQAIQMACQNLVDPGSSPSQVLSAA 
TIVAKHTSALCNACRIASS KTANPVAKRHFVQSAKEVANSTANL 
VKTI KALDGDFSEDNRNKCRI ATAPL I EAVENLTAFASNPEFVS 
I PAQ I S S EGSQ AQE P I LVSAKPMLES 5 S YL I RTARSLA INP KDP 
PTW S VLAGHS H7VS DS I KS L I TS IRDKAPGQRECD YS I DG INRC 
IRDIEQASLAAVSQSLATRDDIS VEALQEQLTS WQE I GHLI DP 
IATAARGEAAQLGHKGTQLASYFBPLILAAVGVASKILDHQQQM 
TVLDQTKTIAESALQMLYAAKEGGGNPKAQHTHDAITEAAQLMK 
EAVDD I M VTI .NEAASEVGLVGGMVDAI AEAMS KLDEGTPPE P KG 
TFVD YQTT WKYS KAI AVTAQEMMTKS VTN PB ELGGLASQMTSD 
YGHLAFQGQMAAATAEPE E I G FQ I RTR VQDLGHGC I FLVQKAG\ 
ALQVCPTDS YTKRE LIECARAVTE KVSLVLSALQAGNKGTQACI 
TAATAVSG I 1 ADLDTTI M FATAGTLNAENS ET FADHREN I LKTA 
KAL VEDTKLLVSGAAST PDKLAQAAQS SAATI TQLAE WKLGAA 
SLGSDDPETQWIjINAI KDVAKALSDLISATKGAASKPVDDPSM 
YQL KGAAKVMVTNVTSLLKT VKA VED EATRGTRA LEATIBC IKQ 
ELTVFQS KDVPEKTS S P EE SIR MTKG t tmat2vvaviv a^uc norsv 
DVIATAWLSR KAVS DMLTACKQAS FH PDVSDE VRTRALRFGTEC 
TLGYLDLLEHVLV I ZiQKPTPELKQQLAAFSKRVAGAVTEL I QAA 
EAMKGTEWVDPBDPTVIAETELLGAAAS I EAAAKKLEQLKPRAK 
P KQADETLDFEEQ I LEAAKS I AAATSALVKSAS AAQR ELVAQGK 
VGS I PANAADDGQWSQGLI SAARMVAAATS SLCE AANAS VQGHA 
SEEKLISSAKQVAASTAQLLVACKVKADQDSEAMRRLQAAGNAV 
ICRASDNLVRAAQKAAFGKADDDDVWKTKFVGGIAQIIAAQEEM 
L KKERELE E ARKKLAQ I RQQQ YKFLPTEL REDBG 


6011 


44S 


183* 


LLQPAMRKS PGLS DCLWAW I LL LSTLTGR S YGQPSLQDEL KDNT 
TVFTRILDRLLDGYDNRLRPGLGERVTE VKTDI FVTS PGP VSDH 
DMEYT I DVFFROS WKDERLKF KGPMTVIjRIjNNT >M n <z k TtaTDnT p 
FHNGKKSVAHNMTMPNKLLRI T3DGTLLYTMRLTVR\AECPMAF 
GRDFPM\D\AHACPLKFGSYAYTRAEWYEWTREPARSWVAED 
GSRLNQ YDLLGQTVDSGI VQS STGE YVVMTTHFHL KRKIG YFV I 
QTYLPCIMTVI LSQVS FWLNRES VPARTVFGVTTVLTMTTLS IS 
ARNSL P KVAYATAMD W FI AVC YAFVFS AL I EFATVN YFTKRG YA 
WDGKSWPEKPKKVKDPLIKKNNTYAPTATSYTPNLARGDPGLA 
TIAKSATI EPKEVKPETKPPEP KKTFNS VSKIDRLSRIAFPLLF 
G I FNL VYWAT YLNRE PQLKAPT PHQ 


6012 


351 


5013 


PAELFQS FAI WHKBL YDWRLG P WNQCQPVI SKSLEKPLECI KGE 
EG IQ VRE IACI QKDKD I PAED 1 I CE YFE PK PLLEQACL I PCQQD 
CIVSEFSAWSECSKTCGSGLQHRTRHVVAPPQFGGSGCPNLTEF 
QVCQSSPCEAEELRYSLHVGPWSTCSMPHSRQVRQARRRGKNKE 
RE KDRS KG VKDPEARE L I KKKRNRNRQNRQENKYWD IQ IG YQTR 
EVMCINKTGKAADLSFCQQEKLPMTFQSCVITKECQVSEWSEWS 
PCS KTCHDMVS PAGTRVRTRTI RQFPIGS EKECPE FEEKE PCLS 
QGDGWPCATYGWRTTEWTECRVDPLLSQQDKRRGNQTALCGGG 
I QTR E VYCVQANENLLS QLSTH KNKBAS KPMDLKLC TG P I PNTT 
QLCHIPCPTECEVSPWSAWGPCTYENCNDQQGfCKGFKLRKRRIT 
NE PTGGS GVTGNCPHLLEAI PCEEPACY DW KAVR LGDCE P DNG K 
ECGPGTQVQEWCINSDGEEVDRQLCRDAI FP I P VACDAPCPKD 
CVLSTWSTWS SCSHTCSGKTTEGKQIRARS ILAYAGEEGGIRCP 
NSSALQEVRSCNEHPCTVYHWQTGPWGQCIEDTSVSSFNTTTTW 
NGEASCSVGMQTRiCVICVRVNVGQVGPKKCPESLRPETVRPCLL | 
PCKiCDCIVTPYSDWTSCPSXSCKEGDSSIRKQSRHRVIIQLPAN 
GGRDCTDPLYESKACEAPQACQSYRW\KTHKW\HRCQ\LVP\WS 
VQQDS P \GAQEGCGPGRQARAI TCRKQDGGQAGI HE CLQ YAG PV 
PALTQACQ I PCQDDCQLTSWSKFSS CNGDCGAVRTRKRTLVGKS 
KKKEKCKNSHLYPLIETQYCPCDKYNAQPVGNWSDCILPEGKVE 
VLLGMKVQGDIKECGQGYRYQAMACYDQNGRLVETSRCNSHGYI 
EEACIIPCPSDCKLSEWSNWSRCSKSCGSGV3CVRSKWLREKPYN 
GGRPC PKLDHVNQ AQ VYE WPCH S DCNQ YLWVTEPWS I CKVTP V 
NMRENCGEG VQTRKVR CMQNTAJDG PS EHVEDYLCDPE EMPLGS R 
VCKLPCPEDCVISEWGPWTQCVLPCNQSSFRQRSADPIRQPADE 
GRSCPNAVEKEPCNLNKNCYHYDYNVTDWSTCQLSEKAVCGNGI 
KTRMLDCVRSDGKSVDLKYCEALGLEKNVfQMNTSCMVECPVNCQ 
LSDWSPWSECSQTCGLTGKMIRRRTVTQPFQGDGRPCPSLMDQS 
KPCPVKPCYRWQYGQWSPCQVQEAQCGEGTRTRNISCWSDGSA 
DDFS KWDBE FCADI ELI I DGNKNMVLEE S CS QPCPGD CYLKDW 
SSWSLCQLTCVNGEDLGFGGIQVRSRPVIIQELENQHLCPEO>IL 
ET KSCYDGQC YE YKWMAS AWKGS SRT VWCQRS DG INVTGG CLVM 
SQPDADRSCNPPCSQPHSYGSBTKTCHCEEGYTEVMSSNSTLEQ 
CTLI P VWLPTMEDKRGDVKTS RAVHPTQPSSNPAGRGRTWFLQ 
PFGPDGRLKTWVYGVAAGAFVLLIFIVSMIYLACKKPKKPQRRQ 
NNRLKPLTLAYDGDADM 


6013 


1161 


710 


GAFIAG\/PVQPVLtRYPNSLDTTSWAWRGPGVLKVLWLTASQPC"" 
S I VDVE FLPVYHPSP E ESRDPTLYANNVQR VMAQALG I PATECE 
FVGSLPVIWGRLKVALEPQL/WGTGKSASEGWAVRWLCGRWGR 
ARPESNDQPGRVCQAATAIi 


6014 


2857 


613 


EAVAGGME KS RMNLP KG PDTEjCFDKDSFMKEDFD VDHF VS DCKK 
RVQbEELRDDLELYYKLLKTAMVELINKDYADF\V3WLSTNLVGM 
DKALNQLSVPLGQLREEVLSLRSSVSEGIRAVDERMSKQEDIRK 
KKMCVLRL I Q VI RS VEKI E Kl LNS QSS KETS ALEAS S P LLTGQ I 
LERIATEFNQLQFHACQSK\GMPLLDKVRPRIAGITAMLQQSLE 
GLLLEGLQTSDVDIIRHCLRTYATIDKTRDAEALVGQVLVKPYI 
DEVI I EQFVESHPNGLQVMYNKLLE FVPHHCRLLREVTGGAI SS 
EKGNTVPGYDFLVNSVWPQIVQGLEEKLPSLFNPGNPDAFHEKY 
TISMDFVRRLERQCGSQASVKRLRAHPAYHSFNKKWNLPVYFQI 
RFR E I AGSLE AALTD VLEDAPAES P YCLLASHRTWSSLRRCWSD 
EMFLPLLVHRLWRLHSGR FWARYS VFV\N\BLSLRPISNES PKE 
IKKPLVTGSKEPSlTQGNTEDQGSGPSBTKPWSISRTQIjVYW 
ADLDKLQEQLPELLEIIKPKLEMIGFKNFSSISAALEDSQSSFS 
ACVPS LSS K 1 1 QDLSDS C FGFLKSALEVPRL Y RRTNKE VPTTAS 
SYVDSALKPLFQLQSGHKDKLKQAI I QQWLEGTLSESTHKYYET 
VSDVLNSVKKMEESLKRLKQARKTTPANPVGPSGGMSDDDKIRL 
QLALDVE YLGEQIQKLGLQASDIKS FS ALAELVAAAKDQATAEQ 
P 


6015 

> 

6016 


13 
13 


2237 
2237 


AEGCAERRGTEPWELSMSWESGAGPGLGSQGMDLVWSAWYGKC - 
VKGKG S I>PLS AHG I WAWLS RABWDQVTVYLFC0DHKLQR YALN 
RITVWRSRSGNELPLAVASTADLIRCKLLDVTGGLGTDELRLLY 
GMALVRFVNLIS ERKTKFAKVPLKCLAQEVNI PDWI VDLRHEIiT 
HKKMPH INDCRRGC YFVLDWLQKTYWCRQLEN SLRETWEL EEFR 
EGIEEEDQEEDKNIWDDITEQKPBPQDDGKSTESDVKADGDSK 
GSEEVDSHCKKALSHKELYERARELLVSYEEEQFTVLEKFRYLP 
KA I KAWNNPSP RVE C VLAELKGVTCJSNREAVLDAFLDDGFLVPT 
FEQLAALQIEYEENVDLNDVLVPKPFSQFWQPLLRGLHSQNFTQ 
ALLERMLSELPALGISGIRPTYILRWTVELIVANTKTGHNARRF 
SAGQWEARRGWRLFNCSAS LDWPRMVESCLGS PCWASPQLLR 1 1 
F\KAMGQGtiQDE\EQEKLLRI CS I YTQSGENS LVQEGSEAS PIG 
KSPYTLDSLYWSVKPASSSPGSEAKAQQQEEQGSVNDVKEEEKE 
EKEVLPDQVEEEEENDDQEEEEEDEDDEDDEEEDRMEVGPFSTG 
QES PTAENARLLAQXRGALQGSAWQ VSS EDVRWDTFP\LGRMPR 
SRPRTPAELMLENYDTHVI FWTKPVL\ EQRLEPS TCK\TDTLGL 
\ SCG VGS \ GNCSNSSSSNFEGAFLLEARGSIiH \ GL\ KTGLQLF 
ASGCAERRGU'S F VVE LSMS WES GAGPGLGSQGMDLVWS AWYGKC 
VKGKGSLPLSAKGIWAWLSRAEWDQVTVYLFCDDHKLQRYALN 
R I TVWRS RSGNEL PLAVASTADL I RCKLLDVTGGLGTDE LRLLY 
GMALVRPVNLI SBRKT KFAKVPLKCLAQE VNI PD W I VDLRHELT 
HKKMPKINDCRRGCYFVLDWLQKTYWCRQLBNSLRETWELEEFR 
EGI EEEDQEEDKN I WDDI TEQKPEPQDDGKSTESDVKADGDS K 
GSEEVDSHCKKALSHKELYBRARELLVSYEEEQFTVLEKFRYLP 
KAI KAKNNPS PRVECVLAELKGVTCENREAVLDAFLDDGFLVPT 
FEQLAALOI E YBENVDLfJDVL VP KPFS QFWQPLLRGLHSQNFTQ 
ALLERMLSELPALGISGIRPTYILRWTVELIVANTKTGRNARRF 
SAGQWEARRGWRLFNCSASLDWPRMVESCLGSPCWASPQLLRII 
F\KAMGQGLQDE\EQEKHiRICSIYTQSGENSLVOEGSEASPIG 
KSPYTLDSLYWSVKPASSSFGSEAKAQCX)EEQGSVNDVKEEEKE 
BKEVLPDQVEEEEKNDDQEEBEEDEDDEDDEEEDRMEVGPFSTG 
QESPTAENARLLAQKRGAbQGSAWQVSSEDVRWDTFP\LGRMPR 
SRPRTPAELMLENYDTHVIFWTKPVL\EQRLEPSTCK\TDTLGL 
\SCGVGS\GNCSNSSSSNFRGAFLLEARGSLH\GL\KTGLQLF 


6017 


203 




SHQEIEQNSAMAPRKRGGRGISFIFCCFRNNDHPEITYRLRNDS 
NFALQTMEPALPMPPVEELDVMFSELVDELDLTDKHREAMFAIiP 
AEKKWQIYCSKKKDQEENKGATSWPEFYIDQLNSMAARKSLLAL 
EKEEEEERSKTIBSI»K'TALRTKPMRFVTRFIDI>DGLSCILNFLK 
TMDYETSESRIHTSLIGCIKALMNNSQGRAHVLAHSESINVIAQ 
SLSTENIKTKVAVLEILGAVCLVPGGHKKVLQAMLHYQKYASER 
TR FQTliINDLDXSTGR YRD E VS LKTA I MS F I NAVLSQGAGVES L 
DFRLHLRYE\FLMLG1HPVMDKLRKHENSTLDRHLDFFEMLRNE 
DELEFAKRFELVHIDTKSATQMFELTRKRLTHSEAYPHFMSILH 
HCLOMPYKRSGNTVQYWLLLDRIIQQIVIQNDKGQDPDSTPLEN 
FNI KNWRMLVNENE VKQWKEQAE KMRKEHNELQQKLEKKEREC 
DAKTQEKEEMMQTLNKMKEKIiEKETTEHKQVKQQVADLTAQLHE 
LSRRAVCAS I PGGPSPGAPGGPFPSS VPGSLLPPPPPPPLPGGM 
LPPPPPPLPPGGPPPPPGPPPIjGAIMPPPGAPMGLALKKKSIPQ 
PTNALKSFNWSKLPENKLEGTVWTEIDDTKVFKILDLEDLERTF 
S AYQRQQD FFVNSNSKQK EADA IDDTLS S KLKVKE LS V I DGRRA 
QNCNILLSRLKLSNDEIKRAILTMDEQ3DLPKDMLEQLLKFVPE 
KSD IDLLEE HKHE LDRMAKADRFLFEMS RI NHYQQRLQSLY FKK 
KPAERVAEVKP KVEAIRSGS EEVFRSGALKQIjLEWLAFGNYMN 
KGQRGNAYGFK I S SLNK IADTKSS I DKNTTLLHYLITI VENKYP 
SVLNLNEELRDIPQAAKVNMTELDKEISTLRSGLKAVETELEYQ 
KSQP PQPGDKFVSWSQFI TVASFS FSDVEDLLAEAKDLFTKAV 
KHFGE E AG KIQPDE FFG I FDQFLQAVS EAKQENENMRKKKE EE E 
RRARMEAQLKEQRERERKMRKAKENSEESGEFDDLVSALRSGEV 
FDKDL SKLKRNRKR I TNQMTDS SRERP I TKLNF 


6018 
6019 


13 
2 


2510 
1066 


TISQSGGIRRRREAVWFEWNMDFSRLHMYSPPQCVPENTGYTY 
ALSSSYSSDALDFETEHKLDPVFDSPRMSRRSLRLATTACTLGD 
GEA VGADS GTSSA VSLKNRAARTTKQRR STNKSAFS INHVS RQ V 
TSSGVS YGGTVSLQDAVTRRPPVLDESWIREQTTVDHFWGLDDD 
GDLKGGNKAAIQGNGDVGAGAATGHNGFFCSNCNMLS3RKDVLT 
AHPAAPGPVSRVYSRDRNQKCDDCKGKRHLDAHPGRAGTLWHIW 
ACAGYFLLQILRRIGAVGQAVSRTAWSAIiWLAWAPGKAASGVF 
WWLGIGWYQFVTLISWIiNVFUiTRCLRNICICFLVLLIPLFLLLG 
LSLRGQG\NFFSFLPVI*NWASMHRTQRVDDPQDVFKPTTSRLKQ 
P LQGD SEAF P WHWMSG VEQQVAS LSGQ CHHHGEN LR ELTTLLQK 
LQARVDQMEGGAAGPSASVRDAVGQPPRETDFMAFHQEHEVRMS 
HIiEDILGKLREKSEA IQKELEQTKQKT I S A VGEQLL PTVEHLQL 

ijUUbKb h Ltb b WRHVKTG CETVDAVQERVD VQVREMVKLliFS E D 
QQGGSLEQLLQRPSSQFVSKGDLQTMLRDLQLQIIjRNVTHHVSV 
TKQLPTS EA WSAVS E AG AS G I TEAQARAI VN5 ALKL YSQDKTG 
MVDFALBSGGGSILSTRCSETYETKTAIiMSLFGIPLWYFSQSPR 
WIQPDIYPGNCWAFKGSQGYLWRLSMMIHPAAFTLEHIPKTL 
SPTGNISSAPKDFAVYGr.RNEYQEEGQLLGQFTYDQDGESLQMF 
QALKRPDDTAFQIVELRIFSNWGHPEYTCLYRFRVHGEPVK 
TPNDREPPPQRPPSSRRASHLAQE I TSAASLGDQTQILGSLTTA 
PV1TSAIRSMPGISSQILTNAQGQVIGTLPWWNSASVAAPAPA 
QS LQVQAVTPQLLLNAQGQV I ATLAS S PLPPP VAVRK\ PSTPES 
LLKSEVQPrKPTPTVPQPAWIASPAPAAKPSASAPlPITCSBT 
PTVSQLVSKPHTPSLDEDGINLEEIREFAKNFKIRRLSLGLTQT 
QVGQAIiTATEGPAYSQSAICRFEfdiDITPKSAQKLKPVLEKWLN 
EAE LRNQEGQQNLM E FVGGE PS KKRKRRTS FTPQA I EALNAYFE 
KNPLPTGQEITE IAKELNYDRE WRVWFCNRRQTLKNTS KLNVF 
QIP 


6020 
6021 


4953 


549 


EAIQFEVSIGNYGNKFDTTCKPLASTTQYSRAVFDGNYYYYLPW 

AHTKPVVTLTSYWEDI SHRLDAVNTLLAMAERLQTNI EALKSG I 

QG K I PANQLAELWLKL X DEV I EDTRYTL P L»T RGKANVTVLDTQ I 

RKLRSRSLSQIHEAAVRMRSBATDVKSTLAEIEDWLDKLMQLTE 

EPQNSKPDIIIWMIRGEKRLAYARIPAHQVLYSTSGENASGKYC 

GKTQTIFLKYPQEKNNGPKVPVELRVNIWLGLSAVEKKFNSFAE 

GTFTVFAEMYENQALMFGKWGTSGLVGRHKFSDVTGKIKLKREF 

FLPPKGWEWEGEWIVDPERSLLTEADAGHTEFTDEVYQNESRYP 

GGDWKPAEDTYTDANGDKAASPSELTCPPGWEWEDDAWSYDINR 

AVDEKGWE YGI T I PPDHKPKS WAAEKMYHTHRRRRLVR KRKKD 

LTQTASSTAGAMEELQDQEGWEYASLIGWKFHWKQRSSDTFRRR 

RWRRKMAPSETHGAAAIFKLEGALGADTTEDGDEKSLBKQKHSA 

TTVFGANTPIVSCKFDRDYIYHLRCYVYQARNLLALDKDSFSDP 

YAH I C FLHRS KTTE I IHSTLNPTWDQTI I FDEVE I YGEPQTVLQ 

NP PKV I M E LFDNDQ VGKDEFLGRS I FSPWKLNS EMD 1 TPKLLW 

HPVMNGDKACGDVLVTAELILRGKDGSNLPILPPQRAPNLYMVP 

QG I RPWQLTAIEILAWGLRNMKNFQMAS I TSPS LWECGGERV 

ESVVIKNLKKrPNFPSSVLFMKVFLPKEELYMPPLVIKVIDHRQ 

FGR KP WGQCT I ERLDRFRCDP YAGKE DI VPQLKAS LLS AP PCR 

DIV1EMEDTKPLLASKCLSSMSTALSJCMASPATVHLTEKEEEIV 

DffWS KFYAS SGEHEKCGQ YIQKGYS KLKI YNCELENVAE FEGLT 

DFSDTFKLYRGKSDENEDPSWGEFKGSFRIYPLPDDPSVPAPP 

RQFREL PDS VPQE CTVR I Y I VRGLELQ PQDNNGL COPY I K I TLG 

KKVIE\DRDHYIPOTLNPVFGRMYELSCYLPQEKDLKISVYDYD 
TFTRDE KVGBT I IDLENP F\ LSRFG\ SHGG \ I PEEYCVSGVNTW 
RDS LR \ PTQ \ LLQNVARFKG FPQP I LS EDGS R I R YGGRDYS LDE 
FEANKILHQHLGAPEERLALHILRTQGIiVPEHVETRTLWSTFQP 
NIS\RYYLRVIimTKDVILDEKSITGEEMSDIYVKGWIPGNEE 
NKQKTDVHYRS LDGEGN FNWR F VFP FDYL PAEQLC I VAKKEHFW 
SIDQTEFRIPPR\LIIQIW\DNDKFS\LDDYLGFPRTLTCRHTI 
HFLQKS PGGN C / RGLDM I PDLKAMNPLKAKTAS LFEQ KSM KGWW 
P CYAEKDGAR VMAGKVEMTLEILNEKE ADERPAGKGRDE PNMNP 
K LDLPNRPETS FL WFTNPCKTM KFI VWRRFKWVI IG LLFLL ILL 
LFVAVLLYSLPNYLSMKIVKPNV 




4953 


549 



EAIQFEVS2ernfGNKFDTTCKPIJ^TTQYSRAVFT)GNYYYYLW" 
AHTKP WTLTS YWEDI SHRLDAVNTLLAMAERLQTWI EALKSG I 
QGKI PANQLAELWL KL IDE V I EDTR YTLPLTEGKANVTVLDTQ I 
RKLRSRSLSQIHEAAVRMRSEATDVKSTIiAEIEDWLDKLMQLTE 
EPQNSMPDI IIWMIRGEKRLAYARIPAHQVLYSTSGENASGKYC 
GKTQT I FLKYPQE KNNGPKVPVEL RVNI WLGLSAVE KKFNS FAE 
GTFTVFAEMYENQALMFGKWGTS GLVGRHKFSDVTGKIKLKRE F 
FLPPKGWEWEGEWIVDPERSLLTEADAGHTEFTDEVYQNESRYP 
GGDWKPAEDTYTDANGDKAASPS ELTCPPGWEWEDDAWS YD INR 
AVDEKGWEYGITIPPDHKPKSWVAAEKMYHTHRRRRLVRKRKKD 
LTQTAS STAG AKEELQDQEGWE YASL IG WKFHWKQRSS DTFRRR 

RWRRKMAPSETHGAAAIFKLEGALGADTTEDGDEKSLEKQKHSA 
TTVFGANTP I VS CNFDRD YI YHLR CYVYQARNLLALDKDS FSDP 
YAHI CFLHRSKTTE I IHS TLNPTWDQTI I FDEVE I YGEPQTVLQ 
N P PKVI ME L FDNDQVGKDEFLGR S I FSP WKLNS EMD I TP KLLW 
KPVMNGDKACGDVLVTAELILRGKDGSNLP I LP PQRAPNLYMVP 
3G IRPWQLTAIE ILAWGLRNMKNFQMAS ITSPSLWECGGERV 
2SWIKNLKKTPNFPSSVLFMKVFLPKEELYMPPLVIKVIDHRQ 
TORKPVVGQCTIERIiDRFRCDPYAGKEDI VPQLKAS LLSAPPCR 
DI VI EMBDTKPLLASKCLSSMS TALS KMAS PATVH LT EKEE E I V 
DWWS KFYASSGEHE KCGQ Y IQ KG YS KLKI YNCE LENVAE FEGLT 
DFSDTFKLYRGKSDENEDPSVVGEFKGSFRIYPLPDDPSVPAPP 
RQFRELPDSVPQECTVRIYIVRGLELQPQDNNGLCDPYIKITLG 
KKVI E \DRDHYIPNTLNP VFGRMYELS CYLPQ E KDLK I SVYD YD 
TFTRDEKVGETIIDLEN'PF\LSRFG\SHCG\IPEEYCVSGVNTW 
RDSLR\PTQ\LLQNVARFKGFPQPILSEDGSRIRYGGRDYSLDE 

FEANKILHQHLGAPEERLALHILRTQGLVPEHVETRTLHSTFQP 
NI S \ R YYLRVI I WKTKD VI LDEKS I TGE EM S D I YVKG W I PGNE E 
NKQKTDVHYRSLDGEGNFNWRFVFPFDYLPAEQLCIVAKKEHFW 
SIDQTEFRIPPR\LIIQIW\DNDKFS\LDDYU3FPRTLTCRHTI 
HPLQKS PGGNC/RGLDM I PDLKAMNPLKAKTAS LFEQKSMKGW W 
PCYAEKDGARVMAGKVEMTLEILNEKEADERPAGKGRDEPNMNP 
KLDLPNRPETS FLWFTNPCKTMKFI VWRRFKV7VI IGLLFLLILL 
LFVAVLLYSLPNYLSMKIVKPNV 


6022 


4953 


549 


EAI Q FE VS I GNYGN KFDTTCKPLAS TTQYS RAVFDGN Y YY YLPW " 
AHTKPVVTLTSYWEDISHRLDAVNTLLAMAERLQTNIEALKSGI 
QG KI PANQIiAE LWLKL I DEVI EDTR YTLPLTEGKAN VTVLDTQ I 
RKLRSRSLSQIHEAAVRMRSEATDVKSTLAEIEDWLDKLMQLTE 
EPQNSMPDI I IWMIRGEKRLAYARIPAHQVLYSTSGENASGKYC 
GKTQT I FLKYPQE KNNGP KVP VELR VN I WLGLSAVEKKFNS FAE 
GTFTVFAEMYENQALMFGKWGTSGLVGRHKFSDVTGKI KLKREF 
FLPPKGWEWEGEWIVDPERSLLTEADAGHTEFTDEVYQNESRYP 
GGDWKPAEDTYTDANGDKAAS PS ELTC P PG WE WEDDAWS YD I NR 
AVDE KG WE YG IT! PPDHKPKS WVAAEKMYHTHRRRRLVRKRKXD 
LTQTASSTAGAMEELQDQEGWEYASLIGWKFHWKQRSSDTFRRR 
RWRRKMAPSETHGAAAI FKLEGALGADTTEDGDE KS LEKQ KHS A 
TTVFGANTPIVSCNFDRDYIYHLRCYVYQARNLLALDKDSFSDP 
YAHI C FLHRSKTTE I IKSTLNFTWDQT 1 1 FD EVE I YGE PQT VLQ 
NPPKVIMELFDNDQVGKDEFLGRSIFSPWKLNSEMDITPKLLW 
HPVMNGDKACGDVLVTAELII^GKDGSNLPILPPQRAPNLYMVP 
QGIRPWQLTAIEILAWGLRNMKNFQMASITSPSLWECGGERV 
ESWIKNLKOTPNFPSSVLFMKVFLPKEELYMPPLVIKVIDHRQ 
KGRKPWGQCTIERLDRFRCDPYAGKEDIVPQLKASLLSAPPCR 
D I VI EMEDTKPLLAS KCLS SMS TALS KMAS PATVHLT3KEEE I V 
DWWSKF YASSGEHEKCGQY IQKGYSKLKI YNCELENVAEFEGLT 
DFSDTFKLYRGKSD3NEDPSWGEFKGSFRIYPLPDDPSVPAPP 
RQFRB LP DS VPQECTVR I Y I VRGLELQPQDNNGLCDP YI KI TLG 
KKVIE\DRDHYIPNTLNPVFGRMYELSCYLPQEKDLKISVYDYD 
TFTRDEKVGETIIDLENPF\LSRFG\SHCX3\IPEEYCVSGVNTW 
RDSLR\PTQ\LLQNVARFKGFPQPILSEDGSRIRYGGRDYSLDE 

FEANKHjHQHLGAPEERLALHILRTQGLVPEHVETRTLHSTFQP 
N I S \RY YLRVI I WNTKD VI LDEKS ITGEEMS DI YVKG W I PGNEE 
NKQKTDVHYRSLDGEGNFNWRFVFPFDYLPAEQLCIVAKKEHFW 
SIDQTEFRIPPR\LI IQIW\DNDKFS\LDDYLGFPRTLTCRHTI 
HFLQKS PGGNC/RGLDMI PDLKAMNPLKAKTASLFEQKSMKGWW 
PCYAEKDGARVMAGKVEMTLEILNEKEADERPAGKGRDEPNMNP 
KLDLPNR PETS FLW FTNPCKTM K FI VWR R FKWVI IGLLFLLI LL 
LFVAVLLYSLPNYLSMKIVKPNV 


6023 


102 


916 


S QELGMF VELIWLLNTT PDRAEQGKLTL LCDAXTDGS FL VHHFL " 

S FYLKANCKVCFVALIQSFSH YS I VGQKLG VSLTMARERGQLVF 

LEGL/rVCSGR\VFQAQKEPHPLQFLREANAGNLKPLFEFVREA 

LKPVDSGEARWTYPVLLVDDLSVLLSLGMGAVAVLDFIHYCRAT 

VCWELKGNMWLVHDSGDAEDEENDILLNGLSHQSHLILRAEGL 

ATGFCRDVHGQLRILWRRPSQPAVHRDQSFTYQYKIQDKSVSFF 
AKGMSPAVL 


6024 


3 


3260 


FLSFLCYPRFRCLFCLQFAIPASRMEQLNELELLMEKSFWEEAB 
L PAELFQ KKWAS FPRTVLSTGhTONR YLVLAVNTVQNKEGNCEK 
RLVITASQSLENKELCILRNDWCSVPVEPGDIIHLEGDCTSDTW 
IIDKDFGYLILYPDMLISGTSIASSIRCMRRAVLSETFRSSDPA 
rRQMLIGTVLHEVFQJKAINNSFAPEKLQELAFQTIQEIRHLKEM 
YRLNLSQDE I KQEVEDYLPS FCKWAGDFMHKNTSTDFPQMQLSlH 

PSDNSKDNSTCNIEWXPMDIEESIWSPRFGLKGKIDVTVGVKI 

HRGYKTKYKIMPLELKTGKESNSIEHRSQVVLYTLLSQERRADP 

EAGLLLYLKTGQMYPVPANHLDKRELLKLRNQMAFSLPHRISKS 

ATRQKTQLASLPQIIEEEKTCKYCSQIGNCALYSRAVEQQMDCS 

SVPIVMLPKIEEETQHLKQTHLEYFSLWCLMLTLESQSKDNKKN 

HQNIWLMPASEMEKSGSCIGNLIRMEHVKIVCDGQYLHNFQCKH 

GAIPVTNI^GDRVIVSGEERSLFALSRGYVKEItWTTVTCLLD 

RNLSVLPESTLFRLDQEEKNCDlDTPIiGNLSKLMENTFVSKKLR 

DLI IDFREPQFISYLSSVLPHDAKDTVACILKGLNKPQRQAMKK 

VLLS KDYTL I VGM PGTG KTTT I CTLVRI L YACGFS VLLTS YTHS 

A VDN I LLKLAKFK IG FLRS R\Q I OKVHPA I OO PTFHR T Q Tr c T 

KS\LALLEELYTSQLIDATTCMGINHPIFSRKIFDFCIVDEASQ 

ISQPICLGPLFFSRRFVLVGDHQQLPPLVLNREARALGMSESLF 

KRLEQNKSAVVQliTVQYRMNSKIMSLSNKLTYEGKLECGSDKVA 

NAVINLRHFKDVKLELEFYADYSDNPWLMGVFEPNNPVCFLNTD 

KVPAPEQVEKGGVSNVTEAKLI VFLTS IFVKAGCSPS DIG I IAP 

YRQQLKI INDLLARS IGMVBVNTVDKYQD\RDKS I VLVSFVRSN 

KDGTVGELLKDWRRT.NVAITRAKHKLI LLGCVPSLNCYPPLEKL 

LNHLNSEKLI IDLPSREHESLCHILGDFQRE 


6025 
6026 


3977 


SB 


GGFPAQSDHLPPVFPLRSDLLITMSTLiYVSPHPDAFPSIiRAIjIA" I 

ARYGEAGEGPGWGGAHPRICLQPPPTSRTSFPPPRLPALEQGPG 

GLWVWGATAVAQLLWPAGLGGPGGSRAAVLVQQWVSYADTELIP 

AACGATLPALGLRSSAQDPQAVLGALGRALSPLEEWLRLHTYLA 

GEAPTIADLAAVTAIiLLPFRYVLDPPARRIWNITVTRWFVTCVRQ 

PE FRA VLGE WL YSGAR PLSHQ PG P BAPALPKTAAQLKKEAKKR 

EKLBKFQQKQKIQQQQPPPGEKKPKPEKREKRDPGVITYDLPTP 

PGEKKDVSGPMPDSYSPRYVEAAWYPWWEQQGFFKPEYGRPNVS 

AANPRGVFMMCIPPPNVTGSLHLGHALTNAIQDSLTRWHRMRGE 

TrLWNPGCDHAGlATQVWEKKLWREQGLSRHQLGREAFLQEVW 

KWKE E KGDR I YHQLKKLGS S LDWDRACFTMDP KLS AAVTEAF VR 

LHEEGI IYRSTRLVNWSCTLNSAISDI EVDKKBLTGRTLLSVPG 

YKEKVE FGVLVS FA YKVQGS DS DEE VWATTR I ETMLGDVAVAV 

HPKDTRYQHLKGKNVIHPFLSRSLPIVFDEFVDMDFG1GAVKIT 

PAHDQNDYEVGQRfTGLEAISIMDSRGALINVPPPFLGLPRFEAR 

KAVLVALKERGLFRGIEDNPMWPLCNRSKDWEPLLRPQWYVR 

CGEMAQAASAAVTRGDLRILPERHQRTWHAWMDNIRE\WCMFPG 

KLWWG\HR\IPAYFVTVSDPAVPPGEDPDGRYWVSGRNEAEARE 

KAAKEFGVSPDKISLQQDEDVLDTWFSSGLFPLSILGWPNQSED 

LS VFYPGTLLETGHDI LFFWVARMVMLG LKLTGRL P FRE VYLHA 

I VRDAHGRKMSKSLGNVIDPLDVI YGI SLQGLHNQLLNSNLDPS 

EVEKAKEGQKADFPAGIPECGTDALRFGLCAYMSQGRDINLDVN 

R ILGYRHFCNKLWNATKFALRGLGKGF VPS PTSQPGGHESLVDR 

WIRSRLTEAVRLSNQGFQA YDFPAVTTAQ YS FWL YELCD VYLE C 

LKPVLNGVDQVAAECARQTLYTCLDVGIiRLLSPFMPFVTEELFQ 

RLPRRWPQAPPSLCVTPYPEPSECSWKDPEAEAALELALSITRA 

VRP\U^YNLHPESGPTCFLEVAD\EATGALASAVSGYVQGPG 

QAQWVAVAEPWGLPAP\QGCAVAXASDRCSI\HLQLQG\LLDP 

ARELG\KLQ\AKRVEAQ\RQAQ\RLR\ERRA\ASGIiPVXVPL\E 

VQEADEAKLQQTEAELRKVDEAIALFQKML [ 




2674 


514 

1 


GPITFLKKKAKMKDMPLRIHVLLGIAITTLVQAVDKKVDCPRLcH 
TCE I RPWFTP RSI YMEAS TVD CNDI/3LLTF P ARLPANTQI LLLQ 
TNNIAKIEYSTDFPVNLTGLDLSQNNLSSVTNINGKKMPQLLSV 
Y LEENKLTEL PEKCLS ELSNLQ E L YINHNLLST I SPGAF IGLHN 
LLRLHLNSNRLOMINSKWFDALPNLEILMIGENPIIRIKDMNFK 
PLINLRSLVIAGINLTEIPDNALVGLENLESISFYDNRLIKVPH 
VALQKVVNLKFLDLNKNPINRIRRGDFSNMLHLKELGINNMPEL 
ISIDSLAVDNLPDLRKIEATNNPRLSYIHPNAFFRLPKLESLKL 
^SNALS AL YHGTI ES L PNLKE I S I HSNP I RCDCVI R WMNMNKTN 
IRFM EPDS L FCVDPPE FQGQNVRQ VHPRDMMEI CLP L IAPES FP 
3NLNVE AGS YVS FHCRATA\ E PQ PE I YW I TP SGQKLLPNT\ LTD | 








KFYVHS EGTLDINGVTPKEGGLYTCI ATNLVGADLKSVMIKVDG1 
SFPQDNNGSLNIK1RDIQANSVLVSWKASSKILKSSVKWTAPVK 
TENSHAAQSARIPSDVKVYNLTHLNPSTE YKIC1 DI PTIYQKNR 
KKCVTW^KGLHPDQKEYEKNNTTTLMACLGGLLGr IGVI CLI S 

CLSPEMNCDGGHSYVRNYLQKPTFALGELYPPLINLWEAGKEKS 
TSLKVKATVIGLPTNMS I 


6027 
6028


525? 


4148


ggrrapgrpgrsikdeeebtvfrewsfspdplpvryydkdiTki 

llchdmmggylddrfiqgswqtpyafyhwqcidvfvyfshhtv 
tippvgwtntahrhgvcvlgtfitewneggrlceaflagdersy 

QAVADRLVQIT\RFFRFDGWLINIENSLSLAAVGNMPPFLRYIjT 

tqlhrqvpgglvlwydswqsgqlkwqdelnqhnrvffdscdgf 

ftnynwreehlermlgqagerradvyvgvdvfargnvvggrfdt 

djcvgggfrprasgpvpplgphflmdlpfpsapqrndsscssqsg 
dpvalrnrcpapaklcph 


6029 


120 


3432


NCLiLLQAKGMiGE I EDLQQWLTDTERHLLAS KPLGGLPETAicE5~l 

lnvhmevcaapeakeetykslmqkgqqmlarcpksaetnidqdi 

NNLKEKWESVETKLNER\KT\KLEEALNLA\MEFHNSL\QDFIN 
WLTQAEQTLNVASR PS L I LDT VL FQIDEHKVFANEVNS HREQ 1 1 
ELDXTGTH LKYFSQKQD WL I KNL LIS VQSRWEKWQRLVERGR 
SLDDARKRAKQFHEAWSKLMEWLEESEKSLDSELEIANDPDKIK 
TQLAQHKEFQKSLGAKHS VYDTTNRTGRS LKEKTSLADDNLKLD 
DMLS2LRDKWDTI CGKSVERQNKLEEA\LI,FSGQFTDALQALID 
WLYRVEPQLAEDQP VHGDI DLVMNL I DNHKAFQKELGKRTSSVQ 
AIiKRSARBLIEGSRDDSSWVKVQMQELSTRWETVCALS ISKQTR 
IiEAAIiRQAEEFHSVVHALLEWLAEAEQTIiRFHGVLPDDEDALRT 
LIDQHKEF^KLEEKRAELNKATTMGiyiVIAICHPDSITTIKHW 
IT! IRARFEEVLAWAKQHQQRLASALAGLIAKQELLEALLAWLQ 
WAETTIiTDKDKEVI PQE IEEVKALIAEHQTFMEEMTRKQPDVDK 
VTKTYKRRAADPSSLQSHI PVLDKGRAGRKRFPASSLYPSGSQT 
QIETKNPRVNLLVSKWQQVWLLALERRRKLNDALDRLEELREFA 
NFDFDIWRKKYMRWMNHKKSRVMDFFRRIDKDQDGKITRQEFID 
GILSSKFPTSRLEMSAVADIFDRDGDGYIDYYEFVAALHPNKDA 
XAFl riJAJJKIEDEVTROVAKCKCAKRFQVEQIGDNKYRFFLGNQ 
FGDSQQLRLVRILRSTVMVRVGGGWMALDEFLVKNDPCRAKGRT 
NKELREKFILADGASQGMAAFRPRGRRSRPSSRGASPNRSTSVS 
SOAAQAAS PQ VPATTTP K I LHPLTRIJ YGKPWLTNSKMS TP CKAA 
ECSDFPVPSAEGTPIQGSKLRLPGYLSGKOFHSGEDSGLITTAA 
ARVRTQFADSKKTPSRPGSRAGSKAGSRASSRRGSDASDFDISE 

IQSVCSDVETVPQTHRPTPRAGSRPSTAKPSKIPTPQRKSPASK 
LDKSSKR J 


- 


1 j 


3533 

I 
I 


IMPCGSSRDLRGCWTHPNEPVSDLSYFDCIESVMEN£KVLGESM" 
AGI S QNAKTGDLPAFGEC VG I AS KALCGLTEAAAQAAYLVG I FD 
PNSQAGHQGLVDP IQFARANQAIQMACQNLVDPGS SPSQVIjSAA 
T I VAKHTS ALCNACR I AS S KTANPVAKRHF VQS AKE VANSTANL 
VKTIKALDGDFSEDNRNKCRIATAPLIEAVENLTAFASNPEFVS 
IPAQISSEGSQAQEPILVSAKPMLESSSYLIRTARSLAINPKDP 
PTWSVLAGHSHTVSDSIKSLITSIRDKAPGQRECDYSIDGINRC 
IROIEQASLAAVSQSLATRDDISVEALQEQLTSWQEIGHLIDP 
I ATAARGEAAQLGHKGTQLAS YFEPL I LAAVGVAS KILDHQQQM 
TVLDQTKTLAESALQMLYAAKEGGGNPKAQHTHDAITEAAQLMK 
E AVDDI M VTLNEAAS E VGLVGGMVDA I AEAMSKLDE GTPPE PKG 
TPVDYQTTWKYSKAIAVTAQEMMTKSVTNPEELGGliASQMTSD 
YGHLAFQGQMAAATAE PEE I GFQ I RTR VQDLGHGC I FL VQKAG\ 
ALQ VCPTDS YTKREL I ECARAVT EKVS I» VLSALQAGNKGTQACI 
rAATAVSGI I ADLDTT I M FATAGTLNABNS ETFADHRENTLKTA 
KALVEDT KLL VSGAASTPDK LAQAAQSS AAT I TQLAE WKLGAA 
3LGSDDPETQWLINAIKDVAKALSDLISATKGAASKPVDDPSM 
fQLKGAAKVMVTNVTSLLKTVKAVEDEATRGTRALBATlECI KQ 
3LTVFQSKDVPEKTSSPEESIRMTKGITMATAKAVAAGMSCRQE 
JVIATANLSRKAVSDMLTACKQASFHPDVSDEVRTRALRFGTEC 








7LGYLDLLEHVLVI LQKPTPELKQQLAAFS KRVAG AVTBLIQAA 
EAMKGTEWVDPEDPTVIAETELLGAAAS IEAAAKKLEQLKPRAK 
PKQADETLDFEEQI LEAAKS I AAATS AL VKSASAAQRELVAQG K 
VGS I PANAADDGQWSQGL I SAARM VAAATS SLCE AANAS VQGHA 
SEE KLI SS AKQ VAASTAQLLVAC KV KADQDSE AMRRLQAAGNAV 
KRASDNLVRAAQKAAFGKADDDDWVKTKFVGGIAQIIAAQEEM 
LKK3RELEEARKKLAQIRQQQYKFLPTELREDEG 


6030


3 


1777 


FPGRGSPALQLEVLICLGLMGLERALNVIiAPtFYRNIVNLLTEN 
APWNSLAWTVTSYVFLKFLQGGGTGSTGFVSNLRTFLWIRVQQF 
TSRRVELLIFSHLHELSLRWHLGRRTGEVLRIADRGTSSVTGLL 
S YLVFNVI PTLADI 1 1 GI I YFSMFFNAWFGLI VFLCWSLYLTLT 
1VVTEWRTKFRRAMNTQENATRARAVDSLLNFETVKYYNAESYE 
VERYREAI I kYQGLEWKSSASLVLLNQTQNIiVIGLGLIiAGSLLC 
AYFVTEQKLQVGDYVLFGTYIIQLYMPLNWFGTYYRMIQTNFID 
MENM FDLL KK\ ETEVKDLPGAG P FRFQKGR IE FENVH FS YADGR 
BTLQDVSFTVMPGQTLALVGPSGAGKSTILRLLFRFYDISSGCI 
RI DGODISQVTQALFRFSHWELCPKDTVLFNDT IADN IRYGRVT 
AGNDEVEAAAQAAGIHDAIMAFPEGYRTQVGERGLKLSGGEKQR 
VAI ARTILKAPGI I LLDEATSALDTSNERAIQASLAKVCANRTT 
I WAHRLSTWNADQILVI KDGCI VERGRHEALLSRGGVYADMW 
QLQQGQEETSEDTKPQTMER 


6031 


160 


1694 


LRMSENLDKSNWEAGKSKSNDSEEGLEDAVEGADEALQKAIKS 
DSSSPQRVQRPHSSPPRFVTVEELLETARGVTNMAIiAHEIVVNG 
DFQIKPVELPENSLKKRVKEIVHKAFWDCLSVQLSEDPPAYDHA 
I KL VGE I KET LL S FLL PGHTRLRNQ I TE VLDLDL I KQEAENGAL 
DISKLAEFI IGMMGTLCAPARDEEVKKLKDIKEIVPLFREI FSV 
LDLMKVDMAN FA I S S IRPHLMQQS VE YERKKFQ E I LERQPNSLD 
FVTQWLEEASEDLMTQKYKHALPVGGMAAGSGDMPRLS P VAVQN 
YAYLKLLKWDHLQRPFPETVLMDQSRFHELQLQ\REQI,TILGAV 
LLVTFSMAAPG I S SQAD FAE KLKM I VK I LLTDMHL PS FHLKDVL 
TTIGEKVCLEVSS CLSLCGS S PFTTDKETVLKGQIQAVAS PDDP 
IRRIMESRILTFLETYLASGHQKPLPTVPGGLSPVQRELBEVAI 
KFARLVNYNKMVFCPYYDAI hS KILVRS 


6032 


39 


2415 


AARLCRAQPTKSAWMIRDIiSKMYPQTRHPAPHQPAQPFKFTISE 
SCDRIKEEFQFLQAQYHSLKLECEKLASEKTEMQRHYVMYYEMS 
YGLN I EMHKQAE I VKRLNAI CAQVI P FLS QEHQQQ WQAVERAK 
QVTMAELNAIIGQQQLQAQHLSHGHGLPVPLTPHPSGLQPPAIP 
PIGSSAGLUVLSSALGGQSHLPIKDEKKHHDNDHQRDRDSIKSS 
SVS P S AS FRGAEKHRN5AD Y SS ES KKQKTEEKE IAAR YDSDGEK 
SDDNLWDVSNEDPSSPRGS PAHSPRENGIjDKTRLLKKDAP IS P 
AS IAS SS S TPS S KS K ELSLNE KSTTP VS KSNTPTPRTDAP TPG S 
NSTPGLRPVPGKPPGVDPLASSLRTPMAVPCPYPTPFGIVPHAG 

PHHHMRVPAIP PNLTGI PGGKPAYSFHVSADGQMQPVPFPPDAL 
IGPG I PRHARQINTLNHGEWCAWISNPTOHVYTC 
DISHPGNKS PVS QLDCLNRDN YIRSCRLLPDGRTLI VGGEASTL 
S IWDLAAPTPR I KAELTSS APAC YALA I S PDSKVCFSCCSDGNI 
AVWDLHNQTLVRQFQGHTDGAS C 1 01 SNDGTKLWTGGLDNTVRS 
W\DLREGRQLQQHD/FFTSPVFSLGYCP\TEEWLAVGMENSN\V 
EVLHVTKPDKYQLHLHESCVLSLKFAHCGKWF\VSTGKDNLLNA 
W\RTPYG\ASIF\QSKESSS\VLSCDI\SVDDKYIVTGS\GDK\ 
RATVYEVIY 


6033 


39 


2415


AARLCRAQ PTKSAWM I RDLS KM Y PQTRHPA PHQPAQPFK FT I S E 
SCDRIKEEFQFLQAOYHSLKLECEKLASEKTEMQRHYVMYYEMS 
YGIiNIEMHKQAE I VKRLNAI CAQV I PFLSQEHQQQ WQAVERAK 
QVTMAELNAIIGQQQLQAQHLSHGHGLPVPLTPHPSGLQPPAIP 
P XGSSAGLLALSSALGGQSHLP I KDEKKHHDNDHQRDRDS I KS S 
SVS PSAS FRGAEKHRNSADYSSESKKQKTEEK3IAAR YDSDGEK 
SDDNLWDVSNEDPSS PRGS PAHS PRENGLDKTRLLKKDAPIS P 
AS I AS S S S T PS S KSKE LS LNEKS TTP VS KSNTPTPRTDAPTPGS 
NSTPGLRPVPGKPPGVDPLASSIiRTPMAVPCPYPTPFGIVPHAG . 








MNGELTS PGAAYAGLHN I S PQMS AAAAAAAAAAAYGRS P WGFD 
PHHI IMRVPAI PPNLTG IPGGKPAYSFHVSAD3QMQPVPFP PDAL 
IGPQlPRHARQlNTLmGEWCAVTISNPTRHVYTGGKGCVKVW 
DISHPGNKSPVSQLDCIjNRDNYIRSCRLLPDGRTLIVGGEASTL 
SIWDLAAPTPRIKAELTSSAPACYALAISPDSKVCFSCCSDGNI 
AVVTOLHNQTL^QFQGHTDGASCIDISNDGTKLWTGGLDNTVRS 

w\dlrbgrqlqqhd/fftspvfslgycp\teewlavgmensn\v 

EVLHVTKPDKYQLHLHESCVLSLKFAHCGroNfF\VSTGKDNLLNA 
W\RTPYG\ASIF\QSKSSSS\VLSCDI\SVDDKYIVTGS\GDK\ 
RATVYEVIY 


6034 


2683 


714 


esgrrrrlkrrrspcpgtaggpgetnpgpgacprgpreeaaaam" - 
e i apqeapp vpgadgd i e eap aeags p s pas p padgrlkaaakr 

VTF PSDED I VSGAVE P KD P WRHAQSTVTVDB VI GAYKQACQKLNC 
RQIPKLLRQLQEFTDLGHRLDGLDLKGEKLDYKTCEALEEVFKR 
LQFKWDLEQTNLDEDGASALFDMIEYYESATHLNISFNKHIGT 
RGWQAAAHMMRKTSCLQYLXDARNTPLLDHSAPFVARALRIRSS 
LAVLHLENASLSGRPLMLLATALKMNKNLRELYLXADNKIjNGLQ 
DSAQLGNLLKFNCSLQILDLRNNHVLDSGLAYICBGIjKEQRKGL 
VTL\VLWNWQLTHTGMAFLGMTI>PHTQSLETLNLGHNPIGNEGV 
RHLKNGLI SNRS VLRLGIjASTKLTCEGAVAVAEFI aes prllrl 
DL RENE I KTGGLMALS LAI»KVNHS LLRLDIJ3R E PKKEAVKS FIE 
TQKALLAEIQNGCKRNLVLAREREEKEQPPQLSASMPETTATEP 
QP DDE PAAG VQNGAPS PAPSPDSDSDSDSDGEEEEEEEGEREET 
PSGAI DTRDTGSSEPQPPPEPPRSG PPLPNGLKPEFALALPPBP 
PPGPEVKGGSCGLEHELSCSKNEKELEELLLEASQESGQETt: 


6035 


19 


404 


S VTYLG 1 1 LHKNTGAL PAD PVQL I SQTP T P S'l'KQQLL^ FLGM VG 
YFYLWIPGFAILTKPLCKLTKENLADAIDPKSFSHSSFRSLKTA 
LENASTLALPDS SQP F \ S LHTAE VQ G CWE I LTQGLGPLPV 


6036 


1745 


356 


LP DVEKLGRRRGRKMDS VEKGAATS VSN PRGRP S RGRP PKLQRN 
SRGGQGRGVEKPPHLAALILARGGSKGI PLKNI KHLAGVPLIGW 
VLRAALDSGAFQ S VW VS TDHDE IENVAKQ FGAQVHRRSSE VSKD 
S STSLDAI I E FLNYHNE VDI VGNI QATSPGLHPTDLQKVAEMIR 
EEGYDS VPS WRRHQFRWSEI QKGVREVTE PLNLNPAKRPRRQD 
WDGELYENGSF^FAKRHLIEMGYLQGGKMAYYEMRAEHSVDIDV 
DIDWPIAEQRVLRYGYFGKEKLKEIKLLVCNIDGCLTNGHIYVS 
GDQKEIISYDVKDAIGISLLKKSGIEVRLISBRACSKQTLSSLK 
LD CKME VS VS DKLAWDE WRKEMG LC WKEVAYLGNEVSDEE CLK 
RVGLS G APADACSTAQKAVG Y I CKCNGGRGA\ I REFAEHI C\LL 
MEKGLINFMPKNRNLAVNI GEKK 


6037 


2936 


1919 


WTSWWMSSVLTILLFSLQGNKMLNYSAPSAGGYLLPRKPVGTPA 
GGGFPRRHSVTLPSSKFRQNQLLSSLKGEPAPALSSRDSRFRDR 
SFSEGGERLLPTQKQPGGGQVNSSRYKT\ELCRPFEENGACKYG 
DKCQFAHGIHELRSLTRHPKYKTELCRTFHTIGFCPYGPRCHFI 
HMAEERRALAGARDLSADRPRLQHSFS FAGFPSAAATAAATGLL 
DS PTS ITP PP I LSADDLLGS PTLPDGTNNPF\AFS SQELAS LFA 
PSMOLPGGGSPTTFLFRPMSESPHMFDSPPSPQDSLSDQEGYLS 
SSSS SHSGSDSPTLDNSRRLP I FSRLS ISDD 


6038 


1450 


426 


S SALQEFGTRNHTFGVPL PHRRKQIXSCNI CQLR FNS DSQAAAH 
YKGTKHAKKLKALEAMKNKQKS VTAKDSAKTTFTS ITTNT INTS 
SDKTDGTAGTPAISTTTTVEI RKSS VMTTEITS KVEKSPTTATG 
NSSCPSTETEEEKAKRLL\YCSLCKVAVNSASQLEAHNSGTKHK 
TMLEARNGSGTI KAFPRAGVKGKGPVNKGNTGLQNKTFHCEICD 
VHVNSETQIiKQHlSSRRHKDRAAGKPPKPKYSPYNKLQKTAHPL 
GVKLVFSKEPSKPLAPRIIjPNPLiAAAAAAAAVAVSSPFSLRTAP 
AATLFQTSALPPALLRPAPGPIRTAHTPVLFAPY 


6039 


4073 


1000 


ldeyearltlanlddfeednedddenrvnqeekaakitelinkl"'" 
nfldeaekdlatvnsnpfddpdaaelnpfgdpdseepitetasp 
rktedsfynnsynpfkevqtpqylnpfdepeafvtikdsppqst 
kkknirpvdmskylyadsskteebeldesnpfyepkstpppnnl 

VNPVQELETERRVKRKAPAPPVLSPKTGVLNENTVSAGKDIjSTS 


6040 






PKPS P I PS P VLGRKPNAS QS LLVW CKE VTKN YRGVKI I'N FTT S W 
RNGLSFCAILHHFRPDLIDYKSLNPQDIKENNKKAYDGFASIGI 
SRLiiEPSDMVLLAIPDKLrVMTYLYQIRAHFSGQELNWQIEEN 
SSKSTYKVGNYETDTNSSVDQBKFYAELSDLKREPELQQPISGA 
VDFLS QDDS VFVNDSG VGES ES EHQTPDDHLS PSTAS P YCRRTK 
S DTE PQKS QQS SGRTSGS DD PG I CSNTDS TQAQVLLGKKRLL KA 
ETLELSDLYVSDKKKDMSPPFICEETDEQKLQTLDIGSNLEKEK 
LENSRSLECRSDPESPIKKTSIjSPTSKLiGYSYSRDLDIiAKKKHA 
SLRQTBSDPDADRTTLNHADHSSKIVQHRLLSRQEELKERARVL 
LEQARRDAALKAGNKHNTNTAAPFCNRQLSDQQDEERRRQLRER 
ARQLIAEARSGGKMSELPSYGERAAEKLKERSKASGDENDNIEI 
ulmlcxvkasz v vijOCaDiiljTNLENDLDTPEQNSKLVDLKLKKLLB 
VQPQVANSPSSAAQKAVTESSEQDMKSGTEDLRTERLQKTTERF 
RKPWFSKDSTVRKTQLQSFSQYIENRPEMKRQRSIQEDTKKGN 
EEKAAITETQRK?SEDEVLNKGFKDS\SQYWGELAAI,ENEQKQ 
IDTRAALVEKRLRYLMDTGRNTEEEEAMMQEWFMLVNKKNALIR 
RKNQLSLLEKEHDLERRYELLNRELRAMIiAI EDWQKTEAQKRRE 

QLLLDELVALVNKRDALVHDIJ3AQEKQAEEEDEHLBRTLEQVKG 
KMAKKEE KCVLQ 




475 


1052


PTALMTAPSC^FPVQFRQPSVSGLSQITKSLYISNGVAANNKLM 
LSSNQI TMVINVS VEWNTLYEDI QYMQVPVADSPNSRLCDFFD 
PIADHIHSVEMKQGR\TLLHCAAGVSRSAALCLAYLMKYHAMSL 
LDAKTWTKSCRP I IRPNSGFWEQL IHYEFQLFGKNTVHMVSSPV 
GMIPDIYEKEVRLMIPL 


6041 
6042 


2 


3B86 


TEKDEKTAHNLENVLIHFWERLSEICVAKISEPEADVESVLGVS 
NLUJVLQKPKGSLKSSKKKWGKVRFADEILESNKENEKCVSSEG 
E KIE CWELTTEPS LTHNSSGLLS PLRKKPLEDLVCKLAD IS I NY 
VNER KS EQHLRFLSTLLDSFS S SR VFKMLLGDE KQS I VQAKP LE 
IAKLVQKNPAVQFLYQKLlGWLNEDQRKDFGFLVDIIiYSALRCC 
DNDMERKKVLDDLTKVDLKWNSLLKI IEKACPSSDKHALVTPWL 
KGDILGEKLVNLADCLCNEDljESRVSSESHFSERWTLI,SLVLSQ 
HVKNDYLIGDVYVER I IVRLHETLFKTKKLSEAESSDSS VSFI C 
DVAYNYFSSAKGCLLMPSSBDLLLTLFQLCAQSKEKTHLPDFLI 
CKLKNTWLSGVNLLVHQTDSSYKESTFLHLSALWLKNQVQASSL 
DINSLQVLLSAVDDLLNTLLESEDSYLMGVYIGSVMPNDSEWEK 
MRQSLPMQWLHRPLLEGRLSLNYECFKTDFKEQDIKTLPSHIjCT 
SALLSKMVLI ALRKETVLENNELEKI IAELIiYSLQWCEELDNPP 

ifligfceilqkmnitydnlrvlgnmsgllqllfnrsrbhgtlw 
sliiaklilsrsissdevkphykrkesffpltegnlhtiqslcp 

FLSKEEKKEFSAQCIPALLGWTKKDLCSTNGGFGHLAIFNSCLQ 

tksiddgellhgilkiiiswkkehediflfscnlseaspevlgv 

NIEXXRFLSLFLKYCSSPLAESEWDFIMCSMLAWLETTSENQAL 
YSIPLVQLFACVSCDLACDLSAFFDSTTLDTIGNLPVNLISEWK 
EFFSQGIHSLLLPILVTVTGENKDVSETSFQNAMLKPMCSTLTY 
ISKEQLLSHKLPARLVADQKTNLPEYWTLLNTIAPLLLFRARP 
VQIAVYHMLYKLMPELPQYDQDNLKSYGDEEEEPALSPPAAIjMS 
ajuo j. v c. uiijj cum v i/j l X V VCiOI VTI KPLSEDFCYVLG YH/TWKI*! 

LTFFKAASSQLRALYSMYLRKTKSLNKLLYHLFRIJ^PENPTYAB 
TAVEVPNKDPKTFFTEELQLSIRETTMLPYHIPHIACSVYHMTI. 
KDLPAWVRLWWNSSEKRVFNIVDRFTSKYVSSVLSFQE1SSVQT 
STQLFNGMTVKARATTREVMATYTIEDIVIELIIQI.PSNYPLGS 
I IVESGKRVGVAVQQWRNWMLQLSTYLTHQNGS IMEGLALWKNN 
VDKR FEG VEDCM I C FS VI HGFN YS LP KKACRTCKKKFHSA \ CL Y 
KWFTSSNKSTCSLCRETFF 




1306 


253 

( 
( 


MAEIAPASPSDI KA5VSNGDrTLLCSRkQSCGMN£VRQVSLTYP~ 

SS PAPSHSLPLQ PRSGGS LCPSRAW/PDPHQLFDDTSSAQSRG Y 

3AQRAPGGLSYPAASPTPHAAiTLADPVSNMAMAYGSSrj^QGKE 

LVDKNIDRFIP ITKLKYYFAVDTMWGRKLGLLFFPYIjHQDWEV 

3YQQDTPVAPRFDVNAPDLYIPAMAFITYVLVAGLALGTQDRFS 

PDLLGLQASSALAHLTLEVLAII^SLYLVTVNTDLTTIDLVAFL 

3YKWGMIGGVLMGLLFGKIGYYLVLGWCCVAIFVFMIRTLRLK 


6043 


403 


599 


Z XiADAAAEG VP VRGARNQLRMY LTMAVAAAQPMK^iYWLTFH L VR 
LCLFFPFPCATPVLPLPSLISAt,/CLSHLsVSSWFCPCQPPLPC 
PLP PLQNKTAKGSLSTEQS ERG 


6044 


793 


412 


" KLfeMWNFTtiSKVKISREVTMIASKFGIGQQVRHSLLGYLGVW 
DIDPVYSLSEPSPDELAVNDELRAAPWYHWMEDDNGLPVHTYL 
AEAQLSSELQDEHP\EQPSMDELAQTIRKQLQAPRLRN 


6045 


155 


2299 


S PLPQ VAAMN YIjRRRL S DSNFMANLPNG YMTDLQR PQ PP PP PPG 
AHSPGATPGPGTATAERSSGVAPAASPAAPSPGSSGGGGFFSSL 
SNAVKQTTAAAAATFS EQVGGGSGGAGRGGAASRVLLVI DEPHT 
EWAK YFKG KKIHGBIDI KVBQAEFS DLNL VAHANGG FS VDME VL 

rngvkwrslkpdfvlirqhafsmarngdyrslviglqyagips 

VNSLHSVYNFCDKPWVFAQMVRLHKKLGTEEFPLIDQTFYPNHK 
BMLSS\TTYPVWKMGHGTLWGWGKVKVDNQHDFQDIASVVALT 
KTYATAEPFIDAKYDVRVQKIGQNYKAYMRTSVSGNWKTNTGSA 
MLE Q I AMS DR YKLWVDTCS E I FGGLD I CAVEALHG KDGRDH HE 
WGS S MP L I GDHQDED KQL I VELVVNKMAQALPRQRQRDASPGR 
GSHGQTPS PGALPLGRQTSQQPAGPPAQQRPPPQGGPPQPGPG P 
QRQGPPLQQRPPPQGQQHLSGLGPPAGSPLPQRLPSPTSAPQQP 
AS QAAPPTQGQGRQSR PVAGGPGAPPAARPPAS PS PQRQAGPPQ 
ATRQTS VSGPAP PKAS GAPPGGQQRQGP PQKP PGPAGPTRQAS Q 
AGPVPRTGPPTTQQPRPSGPGPAGRPKPQLAQKPSQDVPPPATA 
AAGG P PH PQLNKSQS LTNAFNIj PE PAPPRPS LSQDE VKA ETI R S 
LRKSFASLFSD 


6046 


212 


1075 


E G LTG ?CER VP FLLG RG P PHGATRAGHRRAVRWAGPES L PPLPR 
SLIMDSPRAGTHQGPLDAETEVGADRCTSTAYQEQRPQVEQVGK 
QAPLS PGLPAMGGPGPGP CEDPAGAGGAGAGGSEPLVTVTVQCA 
F TVAL RARRGADLSSLRALLGQALPHQ \AQLGQLS YLAPGEDGH 
WVP 1 P E EES LQRAWQDAAACPRGLQLQCRGAGGRP VL YQ WAQH 
SYS AQGPEDLG FRQGDTVDVLCE VDQAWLEGH CDGR IG I FPKCF 
WPAGPRMSGAPGRLPRSQQGDQP 


6047 


49 


1405 


PVLVTSLRMREADTLRPPQLMEVSADIISTVEFNHTGELLATGD 
KGGRWIFQREPESKNAPHSQGEYDVYSTFQSHEPEFDYLKSLE 
IEEKINKIKWLPQQNAAHSLLSTNDKTIKLWKITERDKRPEGYN 
LKDBEGKLKJ}LSTVTSLQVPVLKPMDLMVEVS PRRIFANGHTYH 
I NS I S VNS DCETYMSADDLRINLWHLAITDRS FTP \NI VD I KPA 
NMEDLTE VI TAS E FHPHHCNLFVY S S S KGSLRL CDMRAAALCDK 

hski»feepedpsnrsffseiis\svsdvkfshsdrymltr\dyl 
tvkvwdl \ nme arp ietyqvhdylrs klcsl y endci fdkfe ca 

WNGSDSVIMTGA\YNNFFRMFDRNTKRDVTL\EASRESSKPRAV 
LKPRRVC VGGKRRRDD I S VDSLDFTKKI LHTAWHPAENI 2 A IAA 
TNNLYIFQDKVNSDMH 


6048 


1 


3194 


G I RT P KF CDS PTS DLEMRNGRGRGKRMR PNSNTPVNETATASDS 
KGTSNS S KTRAGANSKGRRGS QNSS EHR P PASS TS ED VKAS PS S 
ANKRKNKPLSDMELNSSSEDSKGSKRVRTNSMGSATGPLPGTKV 
EPTVLDRNCPS P VLIDCPH PNCNKKYKHINGLKYHQAHAHTDDD 
SKPEADGDSEYGEEPILHADLGSCNG\ASVSQK\GSLSPARSAT 
PKVRLVE PHSPSP S SKFSTKGLCKKKLSGEGDTDLGALSNDGS D 
DGPSVMDETSNDAFDSLERKCMEKEKCKKPSSLKPEKIPSKSLK 
SARPI / APLAI PPQQI YTFQTATFTAASPGSSSGLTATVAQAMP 
NSPQLKPIQPKPTVMGEPFTVNPALTPAKDKKKKDKKKKESSKE 
LESPLTPGKVCRAEEGKSPFRESSGNGMKMEGLLNGSSDPHQSR 
LASIKAEADKIYSFTDNAPSPSIGGSSRX.ENTTPTQPLTPLHVV 
TQNGAEASS VKTNSPAYSDISDAGEDGEGKVDSVKS KDAEQLVK 
EGAiOCTLFPPQPQSKDSPYYQGFESYYSPSYAQSSPGALNPSSQ 
AGVESQALKTKRDEEPESIEGKVKNDICEEKKPELSSSSQQPSV 
I QQRPKM YMQS L Y YNQ YAYVPP YGYS DQS YHTHIiL S TNTAYRQQ 
YEEQQKRQSLEQQQRGVDKKAEMGIjKEREAALKEBWKQKPSIPP 
TLTKAPSLTDLVKSGPGKAKEPGADPAKSVI IPKLDDSSKLPGQ 
ftPEGLKVKLSDASHLSKEASEAKTGAECGRQAEMDPILWYRQEA 
EPRMWTYVYPAKYSDIKSEDERWKEERDRKLKEERSRSKDSVPK 






EDGKESTSSDCKLPTSEESRLQSKEPRPSVHVPVSSPLTQHOSY^ 

I PYMHG YSYSQS YDPNHPS YRSMPAVMMQNYPGS YLPSSYSFS P 

YGSKVSGGEDADKARASPSVTCKSSSESKALDILQQHASHYKSK 

SPTISDKTSQERDRGGCGVVGGGGSCSSVGGASGGERSVDRPRT 

SPSQRLMSTHHHHHHLGYSLLPAQYKLPYAAGLSSTAIVASCOG 
STPSLYPPPRR 


6049 


215 


1089 


AMTGVFDRRVPSIRSGDFQAPFQTSAAMHHPSQESPTIiPESSAT | 

DSOYYSPTGGAPHGYCSPTSASYG\KALNPYQYQYHGVNGSAGS 

Y PAKAYAD YS YAS S YHQ YGGA YNR VPS ATNQPE KE VTE P EVRMV 

NGKPKKVRKPRTIYSSFQLAALQRRFQKTQYLALPERAELAASL 

GLTQTQVKIWFQNKRSKIKKIMKNGEMPPEHSPSSSDPMACNSP 

QS P AVWE PQG SSRSLSHHPHAHPPTSNQS PAS SYL ENS AS WYTS 

AASSINSHLPPPGSLQHPLALASGTLY 


6050 


566 


1718 


kglertccameesdsekttekenlgpr>udppi^epg\gsi^wviH 

m.™ j-»ajj t xoi\£>vjoiji\.i oiTiKoixrAWl 1ARDTRRLGATILD 1 
RIHSLQINSSLSTYSLVDSVGNTKTFDVEHSHVRFIiGNIjVLNLW 
DCGGQDTFMENYFTSQRDNIFRWVEVLIYVFDVESRELEKDMHY 
YQSCLEAILQNSPDAKI FCLVHKMDLVQEDQRDLI FKBREEDLR 
RLSRPLECSCFRTSIWDETLYKAWSS1VYQLIPNVQQLEMNLRN 
FAE I I EADEVLLFERATFLVI SH YQCKEQRDAHRFEKI SNI IKQ 
FKLSCSKXAASFQSMEVRNSNFAAFIDIFTSNTYVMVVMSDPSI 
PSAATLINIRNARKHFEKLERVPGPKQCLLMR j 


6051 


566 


1718 


kglertccameesdsekttekenlg?rmdppi^epg\gsi^wviH 

fniHi i-^f^x v uum\3K&\3&\q& i oMRS 1 I FAN YI ARDTRRLG ATJLD 1 

rihslqinsslstyslvdsvgntktfdvehshvrflgnlvlnlw 
dcggqdtfmenyftsqrdnifrnvevliyvfdvesrelekdmhy 

YQSCLEAILQNSPDAKI FCLVHKMDLVQEDQRDLI FKBREEDLR 
RLSR PLECSCFRTS IWDETLYKAWSS I VYQL I PNVQQLEMNLRN 
FAEIIEADEVLLFERATFLVISHYQCXBQRDAHRFEKISNIIKQ 
FKLS CS KLAAS FQ S ME VRNSNFAAFI DI FTSNT YVMWMS DPS I 

psaatlinirnarkhfeklervdgpkqcllmr S 


6052 


566 


1718 


KGLERTCCAMEE SDSEKTT E KENLGPRMD P PLGE PdJ \GSLG W VL 1 

iA.>n vaJAii'J WVOUoijAI oRLKo 1 1 JrAIVi JLARDTRRLGATILD 

RIHSLQINSSLS TYSL VDS VGNTKTFD VEHSHVR FLGNLVLNLW 
DCGGQDTFMENYFTSQRDNI FRNVEVLI YVFDVESRELEKDMHY 
YQSCLEAILQNSPDAKIFCLVHKMDLVQEDQRDLIFKEREEDLR 
RLSRPLECSCFRTSIWDETLYKAWSSIVYQLIPNVQQLEMNLRN 
FAEI IEADEVLLFERATFLVISHYQCKEQRDAHRFEKISNI IKQ 
FKLSCSKLAASFQSMEVRNSNFAAFIDI FTSNTYVMWMSDPSI 
PSAATLINI RNARKHFEKLER VDGPKQCLLMR ( 


6053 
6054 


201 


1704 


KOTEMNKSR^QSRRRHGRRSHQQNPWFRLRDSEDRSDSRAAQPAI 
HDSGHGDDES PS TS S GTAGTS S VPE LPG FYFDP E KKR YFRLL PG 
HNNCNPLTKESIRQKEMESKRLRLLQEEDRRKKIARMGFNASSM 

LRKSQIiGFLNVTNYCHLAHELRLSCMERKKVQIRSMDPSALASD 
RFNLILADTNSDRLFTVNnvn/nrjqwnT tmt/^ct vpt>t»t t/wr.,, 1 

H ENL Y FTNRKV \NS VCWAS LNHLDSH I LLCLMGLAE TPGCATLL 
PASLFVNSHPAGIDRPG\MLCSFRIPGAWSCAWSLNIQANNCFS 
TGLSRR VLLTNWTGHRQS FGTNSDVLAQQ FALMAP LLFNG CRS 

G2IFAIDLRCGNQGKGWKATRLFHDSAVTSVRILQDEQYLMASD 
MAGKIKLWDLRTTKCVRQYEGHVNEYAYLPLHVHEEEGILVAVG 
QDCYTRIWSLHDARLLRTIPSPYPASKADIPSVAFSSRLGGSRG 
APGLLMAVGQDLYCYSYS | 




1 


1054


PPIARLQEFGTo'RKHMAAPSGVHLLVRllGSHRIFSSPLNHiYLH 1 
KQSSSQQRRNFFFRRQROISHS I VLPAAVSSAHPVPKHI KKPDY 
\TTTGIVPDWGDSIEVICNEDQIQGLHQACQIARHVLLLAGKSLKV 
DMTTBEIDALVHREI ISHNAYPSPLGYGGFPKSVCTSVNNVLCH 
3 1 PDSRPLQDGDI INIDVTVYYNG YHGDTSETFLVGNVDECGKK 
LVEVARRCRDEAIAACRAGAPFSVIGNTISHITHQNGFQVCPHF 
VGKQIGS YFHGHPE I WHHANDSDLPMEEGMAFTIEPI ITEGSPE 
F-KVLEDAWTWSLD/TSKVSAQFEHTVLITSRGAQILTKLPHEA 






WO 01/53312 



PCT/US00/34263 






"605T 



43 



3358 



P P Y FLLS FLAW WLYGQS DRTBTD I SQSAGPPPGTIiQCSAIiHHD P 
GCANCSRFCRDCSPPACQCHTHVPPGNALNGVQPPELSRTLALI 
SS RE P PRKKKKSQTETG KE RERTS FLTQGGKRFELQHG LAG I CM 
TLLITGDS I VSAEAVWDHVTMANRJELAFKAGDVIKVLDASNKDW 
WWGQIDDEEGWFPASFVRLWVNHEDEVEBGPSDVQNGHLDPNSD 
CLCLGRPLQNRDQMRANVINEIMSTERHyiKHLKDICEGYLKQC 
RKRRDMFSDEQLKVIFGNIEDIYRFQMGFVRDLEKQYNNDDPHL 
SEIGPCFLEHQDGFWIYSEYCNNHLDACMELSKLMKDSRYQHFF 

eacrllqqhidiaXidgflltpvqkickyplqlaellkytaqdh 

SDYRYVAAALAVMRNVTQQINERKRRLENIDKIAQWQASVLDWE 
GED ILDRSSELI YTGEMAW I YQ P \ YGRNQQRV FFLFDHQMVLCK 
KDIiIRRDILYYKGRIDMDXYEWDIEDGRDDDFNVSMKNAFKLH 
NKETEEIHLFFAKKLEEKIRWLRAFREERKMVQEDEKIGFEISE 
NQKRQAAMTVRKVPKQKGVNSARSVPPSYPPPQDPLMIGQYLVP 
\DGIAQ5QVFEFTEPKRSQSPFWQNFSRLTPFKK 



SGGRGPVRVRSEQLSPSAEQVSQISQISLGRRPLSSLPPPPSRA 
IAPTRAPDTALTIMEVAEVESPLNPSCKIMTFRPSMEBFREFNK 
YIA YMESKGAHRAGLAKVI P PKEWKPRQCYDDIDNliLI PAP IQQ 
MVTGQSGLFTQYNIQKKAMTVKEFRQLANSGKYCTPRYLDYEDL 
ER KY WKNLTFV AP I YGAD INGS I YDEG VDE WN I ARLNT VLDWE 
EECGISIEGVNTPYLYFGMWKTTFAWHTEDMDLYSINYLHFGEP 
KSWYAIPPEHGKRXjERLAQGFFPSSSQGCDAFLRHKMTLISPSV 
LKKYG I ? FDKI TQEAGEFMI TFPYGYHAGFNHGFNCAESTNFAT 

vrwidygkvaklctcrkdmvkismdifvrkfqpdryqlwkqgkd 

IYTIDHTKPTPASTPEVKAWLQRRRKVRKASRSFQCARSTSKRP 

kadeeeevsdbvdgaevpnpdsvtddlkvsekseaavklrntea 
sseeessasrmqveqnlsdhiklsgnsclstsvtedikteddka 
yayrsvpsisseaddsiplstgyekpexsdpselswpkspescs 
svaesngvltegeesdveshgnglepgeipavpsgernsfkvps 
iaegenktskswrhplsrpparspmtlvkqqapsdeelpevlsi 

EEEVEETESWAKPLIHLWQTKPPNFAAEQBYNATVARMKPHCAI 

ctllmpyhkpdssneendarwetkldevvtsegktkplipemcf 
iysebnieysppnafleedgtslliscakccvrvhascygipsh 
e icdg wlcarckrnawtae cclcnlrggalkqtknnkwahvmca 
vavpbvrftnvpertqidvgri plqrlklkcifcrhrvkrvsga 
ciqcsygrcpasfhvtcahaagvl\mepddwpywnitcfrhkv 

NPNVKS KACEKVI S VGQTV I TKHRNTR Y YSCRVMAVTSQTFYE V 

mfddgsfsrdtfpedivsrdclklgppaegbwqvkwpdgki,yg 

AKYFGSNtAHMYQVEFEDGSQIAMKREDIYTLDEELPKRVKARF 

vsagrchlgtcqvnslssphvsoaqqetylgfwinskxsqcnif 

LSGTY 



6058 



"98T 



6059 



3650 



fvarlkeqegegglgprkekgrargrerrrkmqltrccfvflvq 

GS L YLVI CGQDDG PPGS ED PERDDHEGQ PRPR VPR KRGH I S PKS 

rpmanstli^liappgeawgilgqppnrpnhspppsakvkxifg 

WGDFYSNI KTVALNLLVTGK1 VDHGNGTFSVHFQHNATGQGN IS 

islvppskavefhqeqqifieakaskifnc\rmewekve\rgrr 

TS LFTHDPAKI CS RDHAQS SATWS CSQP FKWCVY I AF YSTD YR 
LVQKVCPDYNYH5DTPYY PSG 

HPLP3ASLGLPSVSLGVSLCVRSALLEA WPMLPKRRRARVGS P " 
SGDAASSTPPSTRFPGVAIYLVEPRMGRSRRAFLTGLARSKGFR 

vldacsseathwmeetsaeeavswqerrmaaappgctppalld 

ISWLTESLGAGQPVPVECRHRLEVAGPSKGPLSPAWMPAYACQR 

P7PLTHHN1X3LSEALE riiAEAAGFEGSEGRLLTFCRAASVLKAL 

PS PVTTLSQLQGLPHFGEH3S RWQELLEHGVCEEVERVRRSE / 

RLFTQIFGVGVKTADRWYREGLRTLDDLREQPQKT.TQQQKAGBP 
5 RE AG PWA3 LNCTLDPS AS TP 



QQDt^I.ADLTDHRAHRCPGDGDDDPQbSWVASSPSSKDVA^PT 
QMIGDGCDLGLGEEEGGTGLPYPCQFCDKSFIRLSYLKRHEQIH 
SDKLPFKCTYCSRLFKHKRSRDRHIKLHTGDKKYHCHECEAAFS 
RSDHLKIHLKTHSSSKPFKCTVCKRGFSSTSSLQSHMQAHKKNK 
ERIAKSEKEAKKDDFMCPYCEDTF3QTEBLEKHVLTRHPQLSEK 








ADbQCIHCPEVFVDENTLLAHIHQAHANQKHXCPMCPE\QFSSV 
\EG VYCHLDS HRQPDS 3NHS VS PD P VLGS VAS MS S ATPDS S AS V 
ERGSTPDS TLKPLRGQKKMRDDGQGWTKWYS CPYCSKRDFNSL 
AVLEIHLKTIHADKPQQSHTCQICLDSMPTLYNLNEHVRKIiHKN 
HAYPVMQFGNISAFHCNYCPEM FAD I NSLQEH IRVSHCGPNANP 
SDGNNAFFCNQCSMGFLTESSLTEHIQ\Q\AHCSVGSAKLESPV 
VQ PTQSFME V YS CP YCTNS PI FGS ILKLTKHI KENHKNI PLAHS 
KKSKAEQSPVSSDVEVSSPKRQRLSASANSISNGEYPCNQCDLK 
FSNFESFQTHLXLHLELLLRKQACPQCKEDFDSQESLLQHLTVH 
YMTTSTHYVCESCDKQFSSVDD\LQKH\LLDMPHPLCCTHGT\L 
CQEVFDS\KVSI\QVHliAVKHSNEKXMYRCTACNWDFRKEADIiQ 
VHVKHSHLGNPAKAHKCI FCGET FS TEVELQCH ITTHS KKYNCK 
FCSKAFHAI ILIjEKHLREKHCVFDAATENGTANGVPPMATKKAE 
PADLQGMLLKNPEAPNSHEASEDDVDASEPMYGCDICGAAYTME 
VLLQNHRLRDHNI RPGEDDGS RKKAE FIKGSHKCNVCS RTF FS E 
NGIJIEHLQTHRGPAKHYMCPICGERFPSLLTLTEHKVTHSKSLD 
TGTCR I CKMPLQSEEE F I EHCQMHPDLRNS LTGFRCWCMQTVT 
STLELKIHGTFHMQKLAGSSAASSPNGOTLQKLYKCALCLKEFR 
SKQDLVKLDVNGLPYGLCAGCMARSANGQVGGLAPPEPADRPCA 
GLRCPECSVKFESAEDLESHMQVDHRDLTPETSGPRKGTQTSPV 
PRKKTYQC I KCQMTFENE R E I Q IHVANHM 1 EEGINHECKLCNQM 
FDSPAKLLCHLIEHSFBGMGGTFKCPVCFTVFVQANKLQQHIFA 
VHGQEDKIYDCSQCPQKFFFQTELQNHTMSQHAQ 


6060 


2145 


202 


S Y E I VG KNKLE VNHSQlLKALCKCS LPSRLL PLGEN LPLLDRG FR T 

KEPRSRGSRERDNMLHLHHSCLCFRSWLPAMLAVLLSLAPSASS 

DISASRPNILLLMADDLG IGDIGCYGWNTMRTPNI DRLAEDGVK 

LTQH I S AASLCTPS RAAFLTGRYPVRSGMVS S IGYRVLQWTGAS 

GGLPTNETT FA K I LEEKG YATGLI GKWHIiGLNCES ASDHCH HPL 

HHG FDHF YGM P FS LMGDCARWE LSE KRVNLEQKLNFLFQ VLALV 

ALTLVAGKLTHLIPVSWMPVIWSAIfSAVLLLASSYFVGALIVHA 

DCFLMRNHTITEQPMCFQRTTPLILQEVASFLKRNKHGPFLLFV 

S FLHVHI PLITMENFU3KSLHGLYGDNVKEMDWMVGRILDTLDV 

EGLSNSTLIYFTSDHGGSLENQLGNTQYGGWKGIYKGGKGMGGW 

EGGIRVPGIFRWPGVLPAGRVIGEPTSLMDVFPTVVRLAGSEVP 

QDRVIDGQDLLPLLLGTAQHSDHEFLMHYCERFLHAARWHQRDR 

GTMWKVHFVTPVFQPEGAGACYGRKVCPCFGEKVVHHDPPLLFD 

LSRDPSETHILTPASEPVFYOVMER\VQQAVWEHQRTLSPVPLQ 

LDRLGNIWRPWLQPCCGPFPLCWCLREDDPQ 


6061 


110 


1330


MNIHMKRKTIKNINTFENRMIiMLDGMPAVRVKTELLESEQGSPN 
VHNYPDMEAVPLLLNNVKG E P P EDS LS VDHFQTQTE PVDLS INK 
ARTSPTAVSSSPVSMTASASSPSSTSTSSSSSSRIiASSPTVITS 
VSSASSSSTVLTPGPLVASASGVGGQQFLHIIHPVPPSSPMNLQ 
SNKLSHVHRIPWVQSVPWYTAVRSPGNVNNTIWPLLEDGRG 
HGKAQMDPRGLS PRQS KSDS DDDDLPNVTLDSVNETGSTALS IA 
RAVQEVHPS PVSRVRGNRMNNQKFPCS ISPFSIESTRRQRTVLN 
PPDSRKTAYSTDCDF\EGLQQKLYTKSSSPGRVHRRTHTGEKPY 
KCTWEGCTWKFARSDELTRHYRKHTGVKPFKCADCDRSFSRSDH 
LALHRRRHMLV 


6062


71 


1079 


ETMAKNGPENCEDCHILNAEAFKSKKICKSIjKICGLVFGILALT 
LIVLFWGSKHFWPEVPKKAYDMEHTFYSNGEKKKIYMEIDPVTR 
TEIFRSGNGTDETLEVHDFKNGYTGIYFVGLQKCFIKTQIKVIP 
EFSEPEEEIDENEEITTTFFEQSVIWVPAEKPIENRDFLKNSKI 
LEI CDNVTMYW\ INPTL \ ISGTFAKQLHHNFAFI ILVSELQDFE 
EEGEDLHFPANEKKGIEQNEQWWPQVKVEKTRHARQASEEELP 
INDYTENGIEFDPMLDERGYCCIYCRRGNRYCRRVCEPIiLGYYP 
YPYCYQGGRVICRVIMPCNWWVARMLGRV 


6063 


71 


1079 


ETMAKNGPENCEDCHIIjNAEAFKSKKICKSLKICGLVFGILALT 
LIVLFWGSKHFWPEVPKKAYDMEHTFYSNGEKKKIYMEIDPVTR 
TEIFRSGNGTDETLBVHDFKNGYTGIYFVGLQKCFIKTQIKVIP 
EFSEPEEEIDENEEITTTFFEQSVIWVPAEKPIENRDFLKNSKI 
LBICDNVTMYW\INPTlA I SGTFAKQLHHNFAF I IL VS ELQDFE 








i-uuuiiwu- rru^aivM^xaywiiy w v vrtj VKV1SK.TRHARQASEEELP 
INDYTENG I E FDPM LDERGYCC I YCRRGNR YCRR VC E PLLG YYP 
YPYCYQGGRVICRVIMPCKWWVARMLGRV 


6064 


913 


311 


HLPQSLPRPTEHSPPYSLEKIVttfb^VAVWDVALSDGVHKrEFEHG 
loUMVV 1 vu^AohiXKlvJiwWr KiiVGKBTFYVGAAKTKATINID 
AISGFAYEYTLEINGKSLKKYMEDRSKTTNTWVLHMDGENFRIV 
LE KDAMD VWCNGKKL ETAGE FVDDGTETHFS t GTH\AC Y I KAV\ 
SSG \ KRKEG I IHTLI VDNRE I PE I AS 


6065
6066


1153 


641 


MSVRVARVAW VRGLGAS Y RRG AS S FP V P P PG AQGVAE LLRDATG ' 
AEEEAPWAATERRMPGQCSVLLFPGQGSQWGMGRGLLNYPRVR 
ELYAAARRVLGYDLLELSLHGPQETLDRTVHCQPAIFVASLAAV 
EKLHHU3PSVIENCVAAAGFSVGEFAALVFAGAMEFAEG 


6067


*8 


3470 


VKENMPATRKPMRYGHTEGHTEVCFDDSGSFIVTCX3SDGDVRIW 

EDLDDDDPKFINVGEKAYSCALKSGKLVTAVSNNTIQVHTFPEG 

VPDGILTRFTTNANHWFNGDGTKIAAGSS D\ FLVKI VDVMDSS 

QQKTFRGKDAPVLSLSFDPKDI FLASASCDGSVRVWQISDQTCA 

ISWPLLQKCNDVINAKSICRLAWQPKS3KLLAIPVEKSVKLYRR 

FSWSHQFDLSDWFISQT^IVTWSPCGQYLAAGSINGLIIVWNV 

ETKDCMERVKHEKGYAICGLAWHPTCGRISYTDAEGNLGIiLENV 

CDPSGiCTSSSKVSSRVEKDYNDLFDGDDMSNAGDFLNDNAVEIP 

S FS KG I INDDEDDED LMMASGRPRQRS HILBDDENS VD IS MLKT 

GSSLLKEEEEDGQEGSIHNLPLVTSQRPFYDGPMPTPRQKPFQS 

GSTPLHLTHRFMVWNS IGIIRCYNDEQDNAIDVEFHDTSIHHAT 

HLSNTLNYTIADLSHEAILLACESTDELASKLHCLHFSSWDSSK 

EWI IDL PQNED I EAICLGQGWAAAATS ALLLR L FTIGGVQKE VF 

SLAGPWSMAGHGEQLFIVYHRGTGFDGDQCLGVQLLELGKKKK 

QXLHGD P LPLTRKS YLAW I G FS AEGTP C YVDSEG I VRMLNRG LG 

KITWTPICNTREHCKGKSDHYWWGIHENPQQLRCIPCKGSRFPP 

TLPRPAVAILSFKLPYCQIATEKGQMEEQFWRSVIFHNHLDYLA 

KNGYEYEESTKNQATKEQQELLMKMIALSCKLEREFRCVELADL 

MTQNAVNLAI KYASRSRKL I LAQKLS ELAVEKAAE LTATQVE EE 

EEEEDFRKKLNAGYSNTATEWSQPRFRNQVEEDAEDSGEADDEE 

KPEIHKPGQNSFSKSTNSSDVSAKSGAVTFSSQGRVNPFKVSAS 

SKEPAMSMNSARSTNILDNMGKSSKKSTALSRTTNNEKSPIIKP 

LIPKPKPKQASAASYFQKRNSQTNKTEEVKEENIiKNVLSETPAI 

<-Fi^NliNQRPKTGFQMWLEENRSMLSDNPDFSDEADIIKEGM 

IRFRVLSTEERKVWANKAKGETASEGTEAKKRKRVVDESDETEN 

QEEKAKENLNLSKKQKPLDFSTNQKLSAFAFKQE 




858 


321 


LPWQRLGVLLSRGKMAVTGWLESLRTAQKTALLQDGRRKVi?YXF — 
PDGKEMAE E YDEKTS E LLVRKWR VKSALGAMGQWQ LE VGDPAPL 
GAGNIGPELIKESNANPIFMRKDTKMSFQWRIRNLPYPKDVYSV 
S VDQKERC 1 1 VRTTNKKYYKKFS I PDLDRHQLPLDDAIiLSFA\T 
PTAP 


6068 


13 


1730 


GSKMADLANEEKPAIAPPVFVFQKDKGQKSPAEQKNIiSDSGEEP 
RG BAEAPHHGTGHPES AGEHALE P PAPAGAS AS TP P PPAPEAQL 
PPFPRELAGRSAGGSSPEGGEDSDREDGNYCPPVKRERTSSLTQ 
FPPSOSEERSSGFRLKPPTLIHGQAPSAGLPSQKPKEQQRSVLR 
PAVLQAPQPKALSQTVPSSGTNGVSLPAJDCTGAVPAASPDTAAW 

RSPSEAADEVCALEEKEPQKNESSNASEEEACEKKDPATQQAFV

FGQNLRDRVKLINESVDEADMENAGHPSADTPTATNYFLQYISS 
SLENSTNSADASSNKFVFGQNMSERVtiSPPKLNEVSSDANRENA 

AAESGSESSSQEATPEKESLAESAAAYTKATARKCLLEKVEVIT

GEEAESNVLQMQCKLFVFDKTSQSWVERGRGLLRLNDMASTDDG 
TLQSR LSD AGPRGSLR \ L I LNTKLWAQ MQ I DKAS EK\ S 1 Rl TAM 
DNEDQGVKVFLISASSKDTGQVYAALHHRILALRSRVEQEQEAK 

MPAPEPGAAPSNEEDDSDDDDVLAPSGATAAGAGDEGDGQTTGS 
T 


6069 


563


27 


PTRPGQAGSSSAMAAQRLGKRVLSKLQSPSRARGPGGSPGGLQK 
RHARVTVKYDRRELQRRLDVEKWIDGRLEELYRGMEAEMPDEIN 
IDELLELESEEERSRKIQGLLKSCGKPVEDFIQELLAKLQGLHR 








Q \ PGLRQPS PS P \DGQPSAPFQG PGARTAS PLTLLALPPGP PER 
RPALLCVLSCI 


6070 


478 


858 


I RVTVDGE PLH Y I F PLQ FLDS PE W/ RFTETHRGRHF \Q VTLTAE 
TDCRYV5WRRKKLYLL FAQHRYI SRLFSVLIGSDI ADKLYALND 
RVYIGKRYKYDIRLPNFTQMSTPEIRRSPLTQHFQNSRRYW 


6071 


2 



1654 


HEARTKGNMALARP \Vf^FSLVTRLLLAPRRGLTVRSPDEPLPV 

VRIPVALQRQLEQRQSRRRNLPRPVLVRPGPLLVSARRPELNQP
ARLTLGRWERAPLASQGWKSRRARRDHFSIERAQQEAPAVRKLS
SKGSFADLGAWKPRVLHALQE\AAPEWQ\PTTVQSSTIPSLLR

GRHWCAAETG S G KT LS Y T .TjPTjT^OT? TiT £1 \ HP 3 T Xi ^ r .P T P A PPf3 T . 
VLVPS R3 LAQQ VRAVAQP LGRSLGLLVRDLEGGHGMRR I R LQ LS 
RQPSADVLVATPGALWKALKSRLI SLEQLSFLVLDEADTLLDES 
FLELVDYI LE KSH I AEGPADL EDP FN P KAQLVLVGATFPEGVGQ 
LLNKVAS PDAVTTI TSSKLHC IMPHVKQTFLRLKGADKVAELVH 
ILKHRDRAERTGPSGTVLVFCNSSSTVNWLGYILDDHKIQHLRL 

0G0MPALMRVG1 FOSFOKS SRDTT.TiCTDT A5!RRT.nQTT\7T?T.W>I 

YDFP PTLQDY IHRAGR VGRVG SEVPGTVIS FVTH P WDVSLVQKI 
ELAARRRRSLPG LASS VKE Pli PQ AT 


6072 


1 


742 


KMERTEMMPT INSQLEFKSKP FPL VS SS RWLVKRG ELTA Y VEDT 
VLFSRRTS KQQVYFFLFNDVLI ITKKKSEES YNVNDYSLRDQLL 
VESCDNEELNSSPGKNSSTMLYSRQSSASHLFTLTVLSNHANEK 
VEM LLGAETQSERARW I TALGHSSGKPPADRTS LTQVE IVRSFT 
AKQPDELSLQVADWLI\YQRVSDGWYEGER\LRDGERGWFPME 
CAKE ITCQAT IDKNVERMGRLLGLETNV 


6073 


620 


860


PCRRGLARPLSRRPG/ S ILVHCAVGVSRSATLVLAYLMLYHHLT 
LVEATKKVKDHRGllPNKfiKr.Rnr.r.ar.nRPT.pnnT.PA 


6074 


168


1110 


PGARCMATELQCPDSMPCllNQQVNSASTPSPEOLRPGDLILDHA 
GGNRASRAKVILLTG YAHS SLPAELDSGACGGSSLNS EGNSGSG 
DSSSYDAPAGNSFLEDCELSRQIGAQLKLLPMNDQIRELQTIIR 
DKTASRGDFMFSADRLIRLWEEGLNQLPYKECMVTTPTGYKYE 
G VKFEKGNCGVS IMRSGEAMEQGLRDCCRS IRIGKILIQSDEET 
0 RAKVY YAKFP PD I YRR KVLLM YP I LOTG \ NTVI E AV K VT . T F Hf? 
VQPSVI ILLSLFSTPHGAKS I IQEF PKI TI LTTEVHPVAPTHFG 
QKYFGTD 


6075 


320 


1091 


P?TCQPQEVEHH\YGYVPILGNKTLPSRCHQCVIVSSSSHLLGT 
KLGPE IERAECTIRMNDAPTTG YSADVGNKTTYRWAHSS VFRV 
LRRPQEFVNRTPBTVFIFWGPPSKMQKPQGSLVRVIQRAGLVFP 
NMEAYAVSPGRMRQFDDLFRGETGKDREKSHSWLSTGWFTMVIA 
VE LCDHVHVYGMVP PN YCSQRPRLQRM PYHYYEP KGPD E CVT YI 
QNEHSRKGNHHRFITEKRVFSSWAQLYGITFSHPSWT


6076


1721 


107 


HPS PTEAPRVQHLTMbCTWR I LFXVAAATGTHAQV^LVQSGABV 
KXPGASVKVSCKVSGYTLTELSMHWVRQAPGKGLEWMGAFDPED 
GET I YAQKFQGRWMTE DTS TDTAYMELS S LRSEDTAVY YCATD 
HGDYAFDIWGQGTMVTVSSAPTKAPDVFPIISGCRHPKDNSPW 
LACLITG YHP TS V\ TVTWYMGTOSOAVORT FP EIORRDS Y YMT <5 
SQLSTPLQQWRQGEYKCWQHTASKSKKEIFRWPESPKAQASSV 
P7AQPQAEGSLAKATTAPATTRNTGRGGEEKKKEKEKEEQEERE 
TKTPECPSHTQPLGVYLLTPAVQDLWLRDKATFTCF\ATGSDLKD 
AHLTWEVAGKVPTGGVE EGLLE RHSNGS Q SQHS RLTL PRS LWNA 
GTSVTCTLNHPSLPPQRLMALREPAAQAPVKLSLNLLASSDPPE 
A\ AS WLLCE VSG FS P PN I LLMWLEDHGE VNTSG FAPAR PLPKP \ 
RSTTFWA\WSVLRVPAPPSPQPATYTCWSHEDSRTLLNASRSL 
EVSYVTDHGPMK 


6077 


3687 


1268 


llpdmnlqpifw1glissvccvfaqtdenrclkanaksc6eciq 

AGPNCGWCTNSTFLQEGMPTSARCDDLEALKKKGCPPDDIENPR 
GSKDIKKNKNVTNRSKGTAEKLKPEDITQIQPQQLVLRLRSGEP 
QTFTLKFKRAED YPI DLYYLM\DLS YSMKDDLENVKSLGTDLMN 
EMRRITSDFRIGFGSFVEKTWPYISTTPAKLRNPCTSEQNCTS 
PFS YKNVLS LTNKGE VFNELVGKQRI SGNLDSPEGGFDAIMQVA 
VCGS LI GWRNVTRLL VFS TDAG FHFAGDGKLGG I VLPNDGQCHL 








ENNMYTMSHY YD Y P S I AHLVQKLSENN I QT I FAVT E E FQP VYKE 
LKNLI P KS AVGTLS ANSSNVI QL I IDAYNS LSSE V I LENGK LS E 
GVTISYQSY\CKNGVNGTGENGRKCSNISIGDEVQFEISITSNK 
CPKXDSDSFKIRPLGFTEEVEVILQYICECECQSEGIPESPKCH 
jsvjW^i r eu^ALKLJ>(£.GRVGRHCECSTDE VNSEDIGCFTARKENQ 
FQKSASNHGRVPSAGQCVCRKRDNTNB I YSGKFCECDNFNCDRS 
NGL I C3GKG VCKCRVCECN PNYTG SACD CSLDTSTCEASNGQ I C 
NGRGICECGVCKCTDPKFWQTCEMCQTCLGVCAEHKECVQCRA 
FNKGEKKDTCTQECSYFNITKVESRDKLPQPVQPDPVSHCKEKD 
VDDCWFYFTYSVNGNNEVMVHVVENPECPTGPDIIPIVAGVVAG 

IVLIGIiALLLIWKLLMIIHDRREFAKFBKEKMNAKWDTGENPlY 
KSAVTTWN P KYEGK 


6078 


1426 


180


KTEDVMELLEEDLTCPlCCSLFDDPRVLPCSHNFCKKCLEGILE 
GSVRNSLWRPVPFKCPTCRKKTFSYWELIPliQVNYSLKGIVEKY 
«KI KIS PKMP VCKGH\ LGQPLNI F\ CIi\ TDMQLDL/CG IC\ATR 
GEHTKHVFCSIEDAYAQERDAFESLFQSFETTORGDALSRLDTL 
BTSKRKSLQLLTKDSDKVKEFFEKLQHTLDQKKNEILSDFETMK 
LAVMQAYDP E INKLNT I LQEQRMA FN I AE AFKD VSE P I VPLQQM 
QEFREIQKVIKETPLPPSNLPASPLMKNFDTSQWEDIKLVDVDK 
LS LPQDTGTFIS KI P WS FYKLFLL I LLLGLVI VFGPTMFLE WS L 
FDDLATWKGCLSNFSS YLTKTADFIEQSVFYWEQVTDGFFI fne 

rfknftlwlnnvaefvckykll 


6079 


1*86 


141 


ATARDLGCARRIDRWME3TPSRGLNRVHLQCRNLQEFLGGLSP" 
GVLDRLYGHPATCLAVFREIiPSLAKNWVMRMLFLEQPLPQAAVA 
LWKKEFSKAQEESTGLIjSGLRIWHTQLLPGGLQGIjIIjNPIFRQ 
NLRIAULGGGKAWSDDTSQLGPDKHARDVPSLDKYAEERWEWL 
HFMVGS PSAAVSQDLAQLLSQAGLMKSTEPGEPPCITSAGFQFL 
LLDTPAQLWYFMLQYLQTAQSRGMDLVEILSFLFQLSFSTLGKD 
YSVEGMSDSLLNFLQHLREFGLVFQRKRKSRRYYPT/RALAINL 
SSGVSGAGGTVHQPGFIV\VETNYRLYAYTBSELQIALIALFSE 
MLYPFP\NMW\ARVTR\ESVQQAIASGITAQQIIHFLRTRAHP 
VKLKQTPVLPPTITDQIR^LWEIiERDRIjRFTEGVLYWQFLSQVDF 
ELL \ LAHAPKLGVLVFB /NTPAKRLMWTPAGHSDVKR FWKRQK 
HSS 


6080 
6081 


1 


1199 


IETI DHVGEFAMAAQAAGVSkQRAATO^LGSNQNALKYLGQDFK 
TLRQQCLDSGVL FKDP E FPACPS ALG YXDLG PGS PQTQG 1 1 WKR 
PTELCPS PQFI VGGATRTDI CQGGLGDCWLLAAI ASLTLNEELL 
YR WPRDQDFQENYAG IFHFQPLCPPS? \ FWQ YGEWVE WT DDR 
LPTKNGQLLFLHSEQGNEFWSALLEKAYAKLNGCYEALAGGSTV 
EGFEDFTGGISEFYDLXKPPANLYQIIRKALCAGSLLGCSIDVY 
SAAEAEAITSQKLVKSHAYS V1X3VE EVNFQGHPEKLI RLRNPWG 
EVEKSGAWSDDAPEWNHIDPRRKEELDKKVEDGEFWMSLSDFVR 

QFSRLEICNLSPDSLSSEEVHKWNLVLFNGHWTRGSTAGGCQNY 
PGSS 




3 


865 


EMLPLLLPLPLLWA/GALAQDARFRLEMPESVTVQEGLCIFVHC 
SVFYLEYGWKDSTPAYGHWFREGVSVDQETPVATNNSTQKVQKE 
TQGRFHLLGDPSRNNCSLSIRDARRRDNGSYFFWVARGRTKFSY 
KYSPLSVYVTALTHRPDILIPEFLKSGHPSNLTCSVPWVCEQGT 
PPIFSWMSAAPTSLGPRTLHSSVLTI IPRPQDHGTNLI CQVTFP 
GAGVTTERTIQLSVSWKSGTVEEVWLAVGWAVKILLLCLCLI 
I LSFHKKKAVRAVEVEENVYAVMG 


6082 
6083 


283 
1865 


1288 
309 


EARSPG PTQTRTAPGLAAPGLAQ PAALRLLLSR P PS AAMD(jD"GD 
PESVGQPEEASPEEQPEEASAEEERPEDQQEEEAAAAA\Y\LDE 
L PEPLLA/ LR VLAALPRH E \ LVQACR \ LVCLRWKE LVDGAPLWL 
LKCOQBGLVPEGGVEEERDHWQQFYPLSKRRRNLLRNPCGEEDL 
EGWCDVEHGGDGWRVEELPGDSGVEPTHDESVKKYFASSFEWCR 
KAQ7IDLQAEG YWE ELLDTTQPA I WKDW YSGRSDAGCIiYELTV 
KLLSEHBNVLAEFSSGQVAVPQDSDGGGWMEISHTFTDYGPGVR 
FVRFEHGGQDSVYWKGWFGARVTNSSVWVEP 

KQWCAERRGLGMSLADELLADLEEAAEEEEGGSYGEEEEEPAIE


6084






DVQEETQLDLSGDSVKTIAKLWDSKMFAEIMMKIEEYISKQAKA

SEVMGPVEAAPEYRVIVDANNLTVEIENELNIIHKFIRDKYSKR

FPELESLVPNALDYIRTVKELGNSLDKCKNNENLQQILTNATIM

WSVTASTTQGQQLSEEELERLEEACDMALELNASKHRIYEYVE

SRMSFIAPNLSIIIGASTAAKIMGVAGGLTNLSKMPACNIMLLG

AQRKTLSGFSSTSVLPHTGYIYHSDIVQSLPPIPPPFSVAP\DL

RRKAARLVAAKCTLAARVDSFHESTEGKVGYELKDEIERKFDKW

QEPPPVKQVKPLPAPLDGQRKKRGGRRYRKMKERLGLTEIR\KQ

ANRMSFGEIEEDAYQEDLGFSLGHLGKSGSGRVRQTQVNEATKA

RISKTLQRTLQKQSWYGGKSTIRDRSSGTASSVAFTPLQGLEI

VNPQAAEKKVAEANQKYFSSMAEFLKVKGEKSGLMST


6085 


1865 


309 


KQWCAERRGLGMSLADELLADLEEAAEEEEGGSYGEEEEEPAIE
DVQEETQLDLSGDSVKTIAKLWDSKMFAEIMMKIEEYISKQAKA

SEVMGPVEAAPEYRVIVDANNLTVEIENELNIIHKFIRDKYSKR
FPELESLVPNALDYIRTVKELGNSLDKCKNNENLQQILTNATIM
WSVTASTTQGQQLSEEELERLEEACDMALELNASKHRIYEYVE
SRMSFIAPNLSIIIGASTAAKIMGVAGGLTNLSKMPACNIMLLG
AQRKTLSGFSSTSVLPHTGYIYHSDIVQSLPPIPPPFSVAP\DL
RRKAARLVAAKCTLAARVDSFHESTEGKVGYELKDEIERKFDKW
QEPPPVKQVKPLPAPLDGQRKKRGGRRYRKMKERLGLTEIR\KQ
ANRMSFGEIEEDAYQEDLGFSLGHLGKSGSGRVRQTQVNEATKA
RISKTLQRTLQKQSWYGGKSTIRDRSSGTASSVAFTPLQGLEI
VNPQAAEKKVAEANQKYFSSMAEFLKVKGEKSGLMST


6086 


2 




SGPRSFQGNRAVGRISLGGKRNPBVTLLPGVijSERVRRWRRARV 
GVARVXPGNPWKPSPATQVPR/VPAQVYLPGRGPPLRBGEELVM 
DEEA Y VLYKRAQTGA PCLS FDI VRDHLGDNR TELPLTL YLCAGT 
OAESAQSNRLMMLRMHNLHGTKPPPSEGSDEEEEEEDEEDEEER 
KPQLE LAMVPH YGG INR VRVS WLGE BP VAGVWSE KGQVEVFALR 
RLLQ WEEPQALAAFLRD EQAQM KP I FS FAGHMGEG FALDWS PR 
VTGRLLTGDCQKNIHLWTPTDGGSWHVDQRPFVGHTR5VEDLQW 
SP TENTVFAS CSADAS 1 R I WDIRAAPSKACMLTTATAHDG0 VNV 
XSWSRREP FLLSGGDDGALKI WDLRQFKSGSPVATFKQHVAPVT 
SVEWHPQDSGVFAASGADHQlTQWDIiG/lVERDPEAGDVBAD^G 
LADLPQQLLFVHQGETELKELHWHPQCPGI.LVSTALSGFTIFRT 


6087 


2419 


1357 


CiAATQHGGAMNLL PCNFHGNG LL YAG FNQDHG CFACGMEKG FR V 
YMTD PL KE KE KQE FL EGG VGHVEMLFRCN YLALVGGG KKE K YP P 
NKVMIWDDLKKKTVIBIEFSTEVKAVKLRR\DKIWVLDSMIKV 
FTFTHNP\HQLHVFE\TCYNPKGLCVLCPNSNNSLLAFPGTHTG 
HVQLVDLASTEKPPVDIPAHEGVLSCIALNLQGTRIATASEKGT 
LIRI FDTSSGHL I QE LRRGS Q AANI YC I NFNQDASL I CVSSDHG 
TVHIFAAEDPKRNKQSSLASASFI>PKYFSSKWSFSKFQVPSGSP 
Cl^FGTEPNAVIAICADGSYYKFLFNPKGECIRDVYAQFLEMT 


6088 


476 


1877 


UWbQRTGLPITIFSRSFPLLTGSDLCBNMPCTCTWRNWRQWIRP 
LVAVI YLVS I WAVPLCVWELQKLEVG IHTKAWF I AGI FLLLT I 
PIS LWV I LQHLVHYTQ PELQK P 1 1 R I LWMVP I YSLDS WI ALKY P 
GIAIYVDTCRECYEAYVIYNFMGFLTNYLTNRYPNLVLILEAKI) 
QQKHFPPIiCCCPPWAMGEVLLFRCKLGVLQYTVVRPFTTIVALI 
CELLGIYDEGNFSFSNAWrYLVI INNMSQLFAMYCLLLFYKVLK 
EELSPIQPVGKFLCVKLWFVSFWQAWIALLVKVGVISEKHTW 
EWQTVEAVATGLQDFI ICIEMFLAAIA\HHYTFSYKPYVQEAEE 
GSCFDSFLAMWDVSDIRDDISEQVRHVGRTVRGHPRKKLFPEDQ 

TTAKISDEILSDTIGEKKEPSDKSVDS 




1684


C*89


aASGLVRLLQQGHRCLLAPVAPKLVPPVRGVKKGFRAAFRFQI^" 
LERQSLLRCPPPPVRRSEKPNWDYHAEIQAFGHRLQENFSLDLL 
KTAFVNS CYXKS EEAKRQQLG I EKEA VLLNL KSNQELSEQGTQ F 
3QTCLTQFLEDE YPDMPTEGI KNL VDFIiTGEEWCHVARNLAVB 
3LTLSEEFPVPPAVLQQTFFAVIGALLQSSGPERTALFIRDFLI 
CQMTGKELFEMWiaiNPMGIiVEELKKRNVSAPESRLTRQSG\A 








PTALPLYPVGLyCDKKLlABGPQETVLVAJSEEAARVALRKLYGF 
TENRRPWNYS KPKBTLHAEKS I TAS 


6089 


3 


3054


TRLG I PGST I S 5 RPRLCALAAEGHFLGHS WTGS RAGAHTGAPAW 
PSRRLRDLPAGGMWRLRRAAVACEVCQSLVKHSSGIKGSLPLQK 
LHLVSRSIYHSHHPTLKLQRPQLRTSFQQFSSLTNLPLRKLKFS 
PI KYGYQPRRNFWPARIATRLLKLRYLIU3SAVGGGYTAKKTFD 
QWKDMIPDLSEYKWIVPDIWEIDEYIDFEKIRKALPSSEDLVK 
LAPDFDKIVESLSLLKDFFTSGSPEETAFRATDRGSESDKHFRK 
VSDKEKIDQLQEELLHTQLKYQRILERLEKENKELRKLVLQKDD 
KGI PFIESLRKSLIDMYSBVLDVLSDYDASYNTQDHLPRWWG 
DQSAGKTSVLEMIAQARIFPRGSGEMMTRSPVKVTLSEGPHHVA 
LFKDS SR E FDLTKEEDLAALRH E I ELRMRKNVKEGCTVS P ET I S 
LNVKGPGLQRMVLVDLPGV1NTVTSGMAPDTKETIFS I S KAYMQ 
DPNAIILCIQDGSVDAERSIVTDLVSQMDPHGRRTIFVLTKVDL 
AEKNVASPSRlQQIIEGKLFPMKAliGyFAWTGKGNSSESIEAI 
REYEEEFFQNSKLLKTSMIJCAKQVTTRNLSLAVSDCFWKI1VRES 
VEQQADSFKATRFNLETEWKNN-YPRLRELDRNELFEKAKNEILD 

EVISLSQVTPKHWEEIljOOSTiWRRVQTH\77rKITVT r>* nmnvt^nr-. 

TFNTT VD I KLKQWTDKQL PNKAVEVAWET1»QEE FSR FMTE P KG K 
EHDD I FDKL KEAVKE ES I KRHKWND FAEDS LR VIQHNALE DRS I 
SDKQQWDAAIYFMEEALQARLKDTENAIENMVGPD\WKKRWLYW 
KNRTQEQCVHNET KNELE KMLKCNEEHPAYLAS DE I TTVRKNLE 
SRG VE VDPS I* I XDTWHQVYRRHFL KTALNHCNL CRRGFY Y YQRH 
FVDSELECNDWLFWR1QRMLAITANTLRQQLTNTEVRRLEKNV 

KEVLEDFAEDGEKKIKLLTGKRVQLAEDLKKVREIQEKLDAFIE 
ALHQEK 


6090 


194 


1560 


PVFVPAPGAVLEQAS /AS PPLATQT VVPLQHCtfx PELPVQAS IL 
FELQLFFCQLIALFVHYINIYKTVWWYPPSHPPSHTSLNFHLID 

FNLLMVTTIVLGRRFIG^TVK'PfcQnPf'VVQT DDOTT T T3T mn T?rm r 

LTATG WSLCRSLIHLFRTYS FLNLL/FPLLSVWDVHS VPAAELR 
P\RKTSLFNHMASMGPREAVSGLAKSRDYLLTLR\RRGSSTQDS 
CMARTPCP/PIIACCLSPSLIRSEVEFLKMDFKWRMKEVLVSSML 
SAYWAFVPVWFVKNTHYYDKRWSCELFLLVSISTSVILMQHLL 
PA3YCDLLHKAAAHLGCWQKVDPALCSNVLQHPWTEECMWPQGV 
LVKKS KNVYKAVGH YNVA I PS DVSH FR FHFFFS KPLRI LN I LLL 

LEGAVIVYQLYSLMSSEKWHQTISLALILFSNYYAFFKLLRDRL 
VLGKAYSYSASPQRDLDHRFS 


6091 


3279 


412 


SSRTREMEEKBILRRQIRLLQGLIDDYKTLHGNAPAPGTPAASG 
WQPPTYHSGRAFSARYPRPSRRGYSSHHGPSWRKKYSIiVNRPPG 
PS DP PADHAVR PLHGARGG QP PVPQQH VLERQVQIiS QGQNWI K 
VKPPSKSGSASASGAQRGSLEEFEDTPWSDQRPREGEGEPPRGQ 
LQPSRPTRARGTCSVEDPLLVCQKEPGKPRMVKSVGSVGDSPRE 
PRRTVS ESVI AVKAS F PSSAL PPRTGVALGRKLGSHS VASCAPQ 
LLGDRRVDAGHTDQPVPSGS VGGPARPASGPRQAREAS LWTCR 
TNKFRKNNYKWVAASSKS PRVARRALSPRVAAEWCXASAGMAN 
KVEKPQLIADPEPKPRKPATSSJCPGSAPSKYKWKASSPSASSSS 
5FRWQS EAGSKDHASQLSP VLSRS PSGD \ RPAVGHSGLKPLSGE 
TPLSAYKVKSRTKlIRRRGSTSLPGDKkSGTSPAATAKSHIiSLR 
RRQALRGKSSWLXKTPNKGLVQVTTHRLCRLPPSRAHLPTKEA 
SSLHAVRTAPTSKVIKl'RYRIVKKTPASPLSAPPFPLSLPSWRA 
RRLSLSRSLVI^RLRPVASGGGKAOPGSPlJWDc:vr v VDrTr«r«trr v 

KVSANKLSKTSGQPSDAGSRPLLRTGRLDPAGSCSRSLASRAVQ 
RSLAIIRQARQRREKRKEYCMYYNRFGRCNRGERCPYIHDPEKV 
AVCTRFVRGTCKKTDGTCPFSHHVSKEKMPVCSYFLKGICSNSN 
CPYSHVYVSRKAEVCSDFLKGYCPLGAKCKKKHTLLCPDFARRG 
ACPRGAQCQLLHRTQKRHS RRAATS PAPGPSDATARSRVS ASHG 
PRKPSASQRPTRQTPSSAALTAAAVAAPPHCPGGSASPSSSKAS 
SSSSSSSSPPASI^HEAPSWEAALAAACSNRLCKLPSFISLQS 
S PS PGAQPRVRAPRAPLTKDSGKPLHI KPRL 


6092 


143 


3190
1 


^KAPPTGESSEPEAKVLHTKRLYRAVVEAVHRLDLlLCNKTAYQ ' 
?VFKPENISLRNKLRELCVKI^Fl^P\^YGRJCAEELLWRKVYyE 








VIQLIKTNKKHIHSRSTLECAYRTHLVAGIGFVQHLLLYIQSHY 
QLBLQCCIDWTHVTDPLIGCKKPVSASGKBMDWAQMACHRCLVY 
LGDLS R YQNELAG VDTBLLAE RF Y YQ ALS VAPQ IGMP FNQ IGTL 
AGSKYYNVEAMYCYLRCIQSEVSFEGAYGNLKRLYDKAAKMYHQ 
LKKCETRKLSPGKKRCKDIKRLLVNFMYLQSLLQPKSSSVDSEL 
TSLCQSVLEDFNLCLFYLPSSPNLSLASEDEEEYESGYAFLPDL 
LIFQMVIICLMCVHSLERAGSKQYSAAIAFTIiALFSHLVNHVNI 
RLQAELEEGENPVPAFQSDGTDEPESKEPVEKEEEPDPEPPPVT 
PQVGEGRKSRKFSRLSCLRRRRHPPKVGDDSDLSEGFESDSSHD 
SARASEGSDSGSDKSLBGGGTAFDAETDSEMNSQESRSDLBDME 
E EEG TRS PTLE PPRGRSEAPDSLNG PLGPS EAS I ASNXjQAMS TQ 
MFQTKRCFRLAPTFSNLLIiQPTTNPHTSASHRPCVNGDVDKPSE 
PASEEGSESEGSESSGRSCRNERSIQEKLQVLMAEGLLPAVKVF 
LDWLRTNPDLIIVCAQSSQSLWNRLSVLLNLLPAAGELQESGLA 
LCPEVQDLLEGCELPDLPSSLLLPEDMALRNLPPLRAAHRRFNF 
DTDRPLLSTLEESVVRICCIRSFGHFIARLQGSILQFNPEVGIF 
VSIAQSEQESLLQQAQAQFRMAQEEARRNRLMRDMAQLRLQLEV 
SQLEGSLQQPKAQSAMS P YLVPDTQALCHHLP VIRQLATSGRF I 
VIIPRTVIDGLDLLKKEHPGARDGIRYLEAEFKKGNRYIRCQKE 
VGKSFERHKLKRQDADAWTLYKILDSCKQLT\LAQGAGEEDPSG 
MVTIITGLPLDNPSLLSGPMQAALQAAAHASVDIKNVIiDFYKQW 
KEIG 


6093 


76 


1002 


ACGRRAMLALR VART/ SRWGAL \ RGAVWAPGTRPS KRRAC WALL 
PPVPCCLGCLAERWRLRPAALGLRLPGIGQRNHCSGAGKAAPR\ 
PAAGAGAAAEAPGGQWGPASTPSLYENPWTIPNMLSMTRIGLAP 
VLGYLI IEEDFNIALGVFALAGLTDLLDGFIARNWANQRSALGS 
ALDPLADKI LIS ILYVSLTYADLI PVPLT YM I I SRDVMLI AAVF 
YVRYRTLPTPRTLAKYFNPCYATARLKPTFISKVNTAVQLILVA 
ASIiAAP VFNYADSI YLQILWCFTAFTTAASAYSY YHYGRKTVQV 
IKD 


6094 


23 


1010 


PFLRCLRGDQKAKMSERKVLNKYYPPDFDPSKIPKLKLPKDRQY 
WRLMAPFNMRCKTCGEYIYKGKKFNMKETVQNEVYLGLPIFR 
FYI KCTRCLAE I TFKTDPENTD YTMEHGATRNFQAEKLLEEEEK 
RVQKE REDE ELNN PMKVLENRTKDS KLEMEVLENLQE L KD LNQR 
QAHVDFEAMLRQHRLSEEERRRQQQEEDEQETAALLEEARKRRL 
LEDSDSEDEAAPSPLQPALRPNPTAILDEAPKPKRKVEVWEQSV 
GSLGSRPPLSRLWVKKAKADPDCSNGQPQA/APHPRSPAEQEG 
GQPYTPDAWRVLPEPTGCIPGQ 


6095 


1 


1599 


TRGRAAERSRGRGHGFLGGGFA\SWDYFPSEDFYRCGYCKNES 
GSRSNGMWAHSMTVQDYQDLIDRGWRRSGKYVYKPVMNQTCCPQ 
YTIRCRPLQFQPSKSHKKVLKKMLKFLAKGEVPKGSCE\DEPMD 
STMDDAVAGDFAL INKLD IQCDLKTLSDDI KESLESEGKNS KKE 
EPQELLQSQDFVGEKLGSGEPSHS 






WO 01/53312 



PCT/US00/34263 











VKVHTVPKPGKGADLSKPPCRKAKElRKBRKRI^KtMQQ^PAGEL 
EGFQAQGHPPSLFPPKAKSNQPKSLBDLIPESLPENASHKLEVR 
WRS S P P SS QPKATLLES VQVY KR YQMV IHKNPPDTPTES QFTR 
FLCSSPLEAETPPNGPDCGYGSFHQQYWLDGKIIAVGVIDILPN 
CVSSVYLYVDPDYSFLSLGVYSALREIAFTRQLHEKTSQLSYYY 
MGFYIHSCPKMKYKGQYRPSDLLCPETYVWVPIEQCLPSLENSK 
YCRFNQDP EAVDEDRSTE PDRLQ VFHKRA I M PYGVYKKQQKDPS 
EEAAVLQYASLVGQKCSERMLLFRW 


6096 
6097 


2277 


575 


QR VRAALLS S AM EDSEALG FEHMGLD PRLLQAVTDIiG W SR PTLI 
QEKAXPLALEGKDLLARARTGSGKTAAYAIPMLQLLLHRKATGP 
WEQAVRG L VL VP T KELARQ AQSM IQQLATYCARDVRVANVSAA 
EDS VS QRA VLM EKPD WVG TPS RI I»S HLQQDS L KLRDS LELLW 
DEADLLFSFGFEEELKSLLCHLPRIYQAFLMSATFNEDVQALKE 
LIIiHNPVTLKLQESQLPGPDQLQQFQWCETEEDKFLLLYALLK 
LS I> I RG KS LL F VNTLERS YRLRLFLEQFS I P TCVLNGEL P LRSR 
CHI I SQPNQG F YDC VIATDAEVLGAPVKGKRRGRGPKGDKASDP 
EAGVARGIDFHHVSAVLNFDLPPTPEAYIHRAGRTARANNPGIV 
LTPVIiPTEQFHLGKIEBLLSGENRGPILLPYQFRMEEIEGFRYR 

crdamrsvtkoairearlkeikebllhseklktyfednprVdlq 

LLRHDL PLH PAWKPHLGHVPDYL VP PALRGLVRPHKK\GRSCL 
PLVGRPREQSPRTHCAASSTKERKSDPQPSPPEWGPLWS 




1673 


192 


APGTWSGGKKKSSFQITSVTTDYEGPGSPGASDPPTPQPPT'app 
PRLPNGEPSPDPGGKGTPRNGSPPPGAPSSRFRWKLPHGLGEP 

yrrgrwtcvdvyerdlephsfggllegirgasggaggrsldsrl 
elaslglgaptppsglsqgptswlrppptspgpoarsftgglgq 

LWPSKAKAEKPPIiSASSPQQRPPEPETGESAGTSRAATPLPSL 

rveaeaggsgartpplsrjrkavdmrlrmelgapeemgqvpplds 
rpsspalyfthdaslvhkspdpfgavaaqkfslahsmlaisghl 
dsdddsgsgslvgidnkieqamdlvkshlmfavreevevlkeqi 
relaernaaleqengllrala\speqlgsagpprgvpr\lgppa 

PNGPFVLSLPSLTIVPLGLPGLASAAWPPLPMPALIVPVFPGVG 

VQALSNGPWSPGPLPHLLIIPSLDGGGEGFRTGRQQGAPFGEET 
QPPPSLPGTPQQ 


6098 
6099 


168 


1074 


NYCLRHRSPLEKDSSPGSSSTSLLIKKQRETSDTPIMRALKELD
EGKIFKNWGTQTEKEDTSNINPRQTETSVNASRSPEKCAQQRQK
RLNSASQRSSSLPPSNRKSSTPTKREIMLTPVTVAYSPKRSPKE

NLSPGFSHLLSKNESSPIRFDILLDDLDTVPVSTLQRTNPRKQL
\QFLPLDDSEEK\TYSEKAT\DNIVNHSSCPEPVPNGVKKVSVR
TAWEKNKSVSYEQCKPVSVTPQGNDFEYTAKIRTLAETERFF\D
ELTKEKDQIEAALSRMPSPGGRITLQTRLNQEAFGRSFGKD


6100 


168 


1074 


NYCLRHRSPLEKDSSPGSSSTSLLIKKQRETSDTPIMRALKELD
EGKIFKNWGTQTEKEDTSNINPRQTETSVNASRSPEKCAQQRQK

RLNSASQRSSSLPPSNRKSSTPTKREIMLTPVTVAYSPKRSPKE

NLSPGFSHLLSKNESSPIRFDILLDDLDTVPVSTLQRTNPRKQL

\QFLPLDDSEEK\TYSEKAT\DNIVNHSSCPEPVPNGVKKVSVR

TAWEKNKSVSYEQCKPVSVTPQGNDFEYTAKIRTLAETERFF\D
ELTKEKDQIEAALSRMPSPGGRITLQTRLNQEAFGRSFGKD


6101 


2 


713 


FVEV5QYR3RADPEPRGRDTMTYAYLFKYIIIGDTGVGKSCLLL 
QFTDKRFQPVHDLTIGVEFGARMVNIDGKQIKLQIWDTAGQESF 
RS I TRS Y YRGAAGALL VYD I TRKE T FNHLTS WLEDARQHS S SNM 

VIMLIGKKSDLESRRDVKREEGEAFARE\HGLIFMETSAKTACN 

VEEAFINTAKEIYRKIQQGLFDVHNEANGIKIGPQQSISTSVGP 
SASQRNSRD IGSNSGCC 




1 


1399 



PRGRAWPLREVSHWLGCRRVCSWSASWGRliPALSARLSPLLAFR 
SKMVFPLSCAVQQYAWGKMGSNSEVAiUiLASSDPLAQIAEDKPY 
^LWMGTHPRGDAKILDNRISQKTLSQWIAENQDSLGSKVKDTF 








NGNLPFLFKVLSVBTPLSIQAHPNKELAEKLHLQAPOhYPDANH" 
KPEMAIALTPFQGLCGFRPVEE1VTFLKKVPEFQFLIGDEAATH 
LXQTMSHDSQAVASSLQSCFSHl^KSEKKVVVBQLNLLVKRISQ 
QaAAGNNMEDIFGEHjLQLHQQYPGDIGCFAIYFIiNLIiTIjKPGE 
AMFLEANVPHAYLKGDCVECMACSDNTVRAGLTPKFIDVPTLCE 
MLSYTPSSSKDRIiFLPTRSQEDPYLSIYDPPVPDFTIMKA\EVP 
G\S VTEYKDLALDSAS IIjLKVQGTVIASTPTTQTPI PLQRGGVL 
F IGANESVS LKLTEPKDLLI FRACCLL 


6102 


70 


2415 


QTPQATLAANGAEDSRGGEMLPAGSIGAS PAAPCCSESGDERKN ' 

LEEKSDINVTVLIGSKQVSEGTDNGDLPSYVSAFIEKEVGNDLK 

SLKKLDKLIEQRTVSKMQLEEQVLTISSEIPKRIRSALKNAEES 

KQFLNQFLEQETHLFSAINSHLLTAQPWMDDLGTMISQIEEIER 

HLAYLKWISQIEELSDNIQQYLMTNNVPBAASTLVSMAELDIKL 

QESSCTHLLGFMRATVKFWHKIIiKDKLTSDFEEILAQLHWPFIA 

PPQSQTVGLSRPASAPEIYSYLETLFCQLLKLQTSHELLTEPK\ 

HSQKNTLFLPPLI^S/WPIOVMLT^Lr)KRFRVWPRrMT?nTvnn c 

KPEWYLAQVLMWIGNHTEFLDEKIQP1LDKVGSLVNARLEFSRG 
LMMLVLEKLATDIPCLLYDDNLFCHLVDEVIjLFERELHSVHGYP 
GTFAS CMH I LS E ETCFQRWLTVER KFALQ KMDS MLS S EAAWVS Q 
YKDITDVDEMKVPDCAETFMTLLLVITDRYKNLPTASRKLQFLE 
LOKDLVDDFRIRLTQVMKEETRASLGFRYCAIXiNAVWYISTVLA 
DWADNyFFLQLQQAALEVFAENNTLS KLQLGQIiASMES S VFDDM 
INLLERLKHDMLTRQVDHVFREVKDAAKLYKKERWLSLPSQSEQ 
AVMSLSSSACPLLLTLRDHLLQLEQQLCFSLF.KIFWQMLVEKLD 
VYIYQEIILANHFNEGGAAQLQFDMTRNLFPLFSHYCKRPENYF 
KHIKEACIVLNLNVGSALTAGKDVLPVQLQGSFPAT


6103


207 


2523 


ESNSTMTTYLEFIQQNEERDGVRFSWNVWPSSRLEATRMWPVA^" 
AL F TP LKERPDLP PI Q YE P VLCS RTTCRAVLNPLCQ VD YRAKLW 
ACNFCYQRNQ FPPSYAGI S ELNQPAELLPQFSS IEY WLRGPQM 
PLI FL YWDTCMEDEDLQALKESMQMS LSLh PPTALVGL I TFGR 
MVQVHELGCEG I S KS YVFRGTKDLSAKQLQEMLGLS KVP VTQAT 
RGPQVQQPPPSNRFLQPVQKIDMNLTDLLGELQRDPWPVPQGKR 
PLRSSGVALS IAVGLLECTFPNTGAR IMMFIGGPATQGPGMWG 
DELKTP IRSWHDI DKDNAKYVKKGTKH FEALANRAATTGHVIDI 
YACALDQTGLLEMKCCPNLTGGYMVMGDSFNTSLFKQTFQRVFT 
KDMHGQFKMGFGGTLEIKTPR\EIKISGAIGPCVSLNSKGPCVS 
ENEIGTGGTCQWKICX3LSPTTrLAIYFEWNQHNAPIPQGG\RG 
A\IQFVTQY\QHSSGQRRIRVTTIARN\WADAQTQIQNIAASFD 
QEAAA I LMARLAI YRAETEEGPDVLR WLDRQL IRLCQKFGE YHK 
DDPS S FR FS ET FS LYPQ FMFHLRRS S FLQ VFNNS PDESS Y YRHH 
FMRQDLTQSLIMIQPILYAYSFSGPPEPVLLDSSSILADRILLM 
DTFFQILIYHGETIAQWRKSGYQDMPEYENFRHLX.QAPVDDAQE 
IbHSRFPMPRYIDTEHGGSQARFLLSKVNPSQTHNNMYAWGQES 
GAP I LTDDVSLQVFMDHL KKLAVSS AA 


6104 


124 


732 


KVSEYI I LSKDKI LFHALAMLVLWSPWSAARGVLRN Y^ERLLR 
KLPQSRPGFPSPPWGPAIAVQ\AQPCLQSQQMIPVEVKRI/RSL 
LDSIFWMAAPKNRRTIEVNRCRRRNPQKLIKVKNNIDVCPECGH 
LKQKHVLCAYCYE KVCKETAE I RRQ IOKQ EGGP FKA PT I ETWL 
YTGETPSEQDQGKRI IERDRKRPSWFTQN 


6105 


3 


SB9 


PLHGACTSLVLQRFCHRRPRPCAPARPEDMRRPAAVPLLLLLCF 
GSQRAKAATACGRPRMLNRMVGGC3DTQEGEWPWQVSIQRNGSHF 
CGGS L I AEQ WVLTAAHCFRNTS ETS L YQVLLGARQLVQPGPHAM 
YARVRQVESNPLYQGTASSADVALVELEAPVPFTNYILPVCLPD 
PSVIFETGMNCWVTGWGSPSEEDLLPEPRILQKLAVPIIDT\PR 
CNLLYSRDTEFGYQPKTIKNDMLCAGFEEGKKDACKGDSAGPLV 
CIiVGQSWLQAGVISWGEGCARQNRPGVYIRVTAHHNWIHRI IPK 










6106 


3 


1302 


GRPPTAPHTGRPPTANRGDPRLDLKRGCARLLTSIESRGRPAAS 
AGLRRDRCALRRWPLRRAP LARATRRRAG S PRRCAPRPRAC PQG 
? W S RARHQ PGG LCLLLLLLCQFMEDRS AQAGNCWLRQAKNGRCQ V 
i L> ix. i ctidWib^Cb ioKliblbWrEEDVNDNTLFKWMIFNGGAPNC 
I P C KETCENVDCGPG KKCRMNKKNKPRCVCAP DCSN I TWKGP VC 
GLDGKTYRKECALLKARCKEQPELEVQYQGRCKKTCRDVFCPGS 
STCV\ VDQTNNAYCVTCNRI CPEPASSEQYLCGNDGVTYS \ SAC 
HLRKATCLLGRSIGLAYEGKCIKAKSCEDIQCTGGKKCLWDFKV 
GRGRCSLCDELCPDSKSDEPVCASDNATYASECAMKJBAACSSGV 
LLEVKHSGSCNSISEDTEEEEEDEDQDYSFPISSIliEW 


6107 


623 


168 


SRCSSPRPEPGRGRGK/LSPSEHRKWVEVFKACDEDKKGYLSRE 
DPKTAVVMLFGYKPSKIEVDSVMSSINPNTSGILLEGFLNIVRK 
KKEAQRYRNEVRH I FTAFDT YYRGFLTLEDFKKAFRQVAPKLPE 
RTVLEVFREV\ DRDS \ DGH VS F 


6108 


3 


1348 


GGSLRFSPPRVPS.CSRVFCPVPPGGCGLPSPMSASRPQSPTTPW - 1 
CLPRRYMKHKRDDGPEKQEDEAVDVTPVMTCVFWMCCSMLVLL 1 
YYFYDLLVYWIGIFCLASATGLYSCIAPCVRRLP\SASAGESA I 
LLAPTIPNNSLPYFHKRPQARMLLLALFCVAVSVVWGVFRNEDQ 
WAWVLQDALGIAFCLYMLKTIRLPTFKACTIiLLLVIjFLYDIFFV 
FITP FLTKSGS S I MVEVATGPSDSATREKLPMVLKVPRLNS S PL* 
ALCDRPFSLLG FGD IL VPGLLVAY CHR FD IQ VQS SR VY FVACTI 
AYG VGLLVTF VALALMQRGQPAIiLYLVPCTL VTS CAVALWRRE L 
GVFWTGSGFAKVLPPSPWAPAPADGPQPPKDSATPLSPQPPSEE 
PATS P WPAEQS PKSRTSEEMGAGAPMREPGS PAES EGRDQAQPS 
PVTQPGASA 


6109 




1381


CRSRAGAASGGAILEGTKLRRQRVDTNKPLDPLVPSALRAAMLY 
LEDYLEMIEQLPMDLRJDRFTEMREMDLQVQNAMDQLEQRVSEFF 
MNAKKNKPEWREEQMAS I KKD Y YKALEDADEKVQLANQ I YDL VD 
RHLR KIjDQELAKFKMELEADNAGITE ILERRSLELDTPSQPVWN 
HHAHSHTPVEKRKYNPTSHHTTTDHI P E KKFKS EAL l*STLTS DA 
SKENTLGCRNNNSTASSNNAYNVNSSQPLGSYNIGSLSSGTGAG 
GI\TMAAAQAVQATAQMKEGRRTSSLKASYEAFKNNDFQLGKEF 
SMARETVGYSSSSALMTTLTQRASSSAADSRSGRKSK2WNKSSS 
QQSSSSSSSSSLSSGSSSSTWQEISQQTTWPESDSNSQVDWT 
YDPNE P R YC I CNQVS YGEM VG CDTQDCP I EWFHYGCVGLTEAPK 
GKHYCPQCT\AAMKRRGSRHK 


6110


77 


2464 


ACPSAATMSDQDKSMDEMTAWKIEKGVGGNNGGNGNG3GAFSQ 
ARSSSTGSSSSTGGGGQESQPSPLAItliAATCSRIESPNENSNNS 
QGPSQSGGTGELDLTATQLSQGAMGWQIISSSSGATPTSKEQSG 
SSTNGSNGSBSSKNRTVSGGQYVVAAAPNLWQQVLTGLPGVMP 
N I QYQV I PQ FQTVDGQQLQFAATGAQ VQQDGSGQ I Q 1 1 PGANQQ 
I ITNRGSGG^ IAAMPNLLO^AVPLQGLANNVLSGQTQ YVTNVP 
VALNGNITLLPVNSVSAATLTPSSQAVTISSSGSQESGSQPVTS 
GTTI S S AS LVSSQ AS SS SF FTNANS YS TTTTTSNMG IMNFTTSG 
SSGTNSQGQTPQRVSGLQGSDALNIQQNQTSGGSLQAGQQKEGE 
Q\NQQTQAAPKSI,SRPQLVQGG\QALQ\AFQAAPLSGOTFTTQA 
I SQETLQNLQLQAVPNSGPI 1 1 RTPTVGPNGQVSWQTLQLQNLQ 
VQNPQAQT I TLAPMQGVSLGQTS SSNTTLTPIASAAS I PAGTVT 
VNAAQLSSM PGLQTINLSALGTSG IQVHP IQGLPLAI ANAPGDH 
GAQLGLHGAGGDG I HDDTAGGEEGENSPDAQPQAGRRTRREACT 
CPYCKDSEGRGSGDPGKKKQHI CH I CGCGKVYGKTSHLRAHLRW 
HTGERPFMCTWSYCGKRFTRSDELQRHKRTHTGEKKFACPECPK 
RFl^lRSDHLSKHIKTHQNKKGGPGVALSVGTLPLDSGAGSEGSGT 
ATPSALITTNMVAMEAICPEGIARLANSGINVKEGGQFCSPINT 
SANGF 
6111
6112 


1637 
77 


797 
196 


RVD PR VRG AMAP WG KRLAG VRGVL LDI SG VL YDSGAGGGTAI AG 
S VEAVARLKRSRLKVRFCTNESQKSRAELVGQLQRLGFD I SEQE 
VTAPAPAACQILKERGLRPYLLIHDGV\ASEFDQIDTS/STPNC 
WIADAGES FS YQNKNNAPQVLMELEKPVLISLGKGRYYKETSG 
LMLDVGPYMKALEYACGIKAEVGGKPSPEFFKSALQAIGVEAHQ 
AVM IGDD I VGD VGGAQRCGMRALQ VRTGKPRPSDEHHPEVKADG 
YVDNLAEAVDLLLQHADK 


6113 


1779 


567 


MS SHKS FKSKRFLAKXQKPNRPI LQW I WLKTGNJCI RHNWK 

WEGRSWAACGVNLQGAWGERSGVRASEAESPGKRADVSWWSRQL 

ETMVDHLANTEINSQRIAAVESCFGASGQPLALPGRVLLGEGVL 

TKECRKKAKPRIFFLFNDILVYGSIVLNKRKYRSQHIIPLEEVT 

LELLP ETLQAKNRMM 1 KTAKKS F WS AAS ATERQEW I SH I EE C V 

RRQ LRATGRPA \ S TEHAAP W I PD KATD I CMRCTQTRFSALTRRH 

HCRKCRWVCAECSRQRFLLPRLSPKPVRVCSLCYRELAAQQRK 

EEAEEQGAGVPRAASHIiARP I CGRPVEMTMTPTRTRRAAG7ATG 

PAAWSSTPRGWPGLPSTADPRPAEHLSPSQLHCPGPQEGSSRSC 

PGLRDPIPWKQVQRWGVALSGIiPVPFCWTLCPYGFTAGNAFPF^ 

KPQNTHRSW 


6114 


816 


246 


PTSRPRPSPGSPAMSWSACVSAAPSSSWPASSSWPCGPRRCCTR 
RRRCSPRCGLAAGSMCSCSPSWRCIPVPACWPSPPP\PAEQVQC 
GHLPPHADRRALRLP VAAPARG PGPGHPAGP AG PRPARTP PAS P 
HGPGRP T VPAPP CPLLAATE PTPSR PHQRWTRE DRMLGRGSQ VT 
GRPQ WFLRGL VL FS L 


6115 


324 


71 


D VCGRVCAHPHLYTH I HM H I CAiiAd \ I HTHAQLC/ 1 TASHALAH - 
SHLYTCMVMLTASHTPSHTHPHTAVHKEHRADVLRGTLTPLR 


6116 


595 


1430 


tgvmppgrwhaa/isssgpvfegara\lqtvkkeeedesytpvq 

AARPQTLNRPGQELFRQLFRQLRYHBSSGPLETLSRLRELCRWW 

lrpdvlskaqilellvleqflsilpgelrvwvqlhnpesgee\l 

VI PCWR S CRG TLMGHPGGTRAL P \ E PRCALDGYRS \ LRS AQI WS L 

ASPLRSSSALGDHLE ppyeieardflagqsdtpaaqmpalfpre 
gcpgdqvtptrsltaqlqetmtfkdvevtfsqdewgwldsaqrn 

U I rCUVM LiJ&N X KNMAS LGK 


6117 


1433 


222 


VGVPS PAPPCSWEVGPGGGWTPG I LKEGQGGRRTPLLLLATRTR 
GLLSLFPPAAMHPAAFPLPVWAAVLWGAAPTRGLIRATSDHNA 
SMDFADLPAliFGATLSQEGLQGFLVEAHPDNACSPIAPPPPAPV 
NGSVFIALLRRFDCNFDLKVLNAQKAGYGAAVVHNVNSNEIjLNM 
VWNSEEIQQQIWIPSVFIGERSSEYIjRALFVYEKGARVLLVPDN 
TFPLGYYL I P FTG I VGLLVLAMGAVMI ARCI QHRKRLQRNRLTK 
\z.ULtivu x \ P I HDYQKGDQYDVCAICLDEYEDGDKLRVLPCAHAY 
HSRCVDPWLTQrRKTCPICKQPVHRGPGDEDQEEETQGQEEGDB 
GEPRDHPASERTPLLGSSPTLPTSFGSLAPAPLVFPGPSTDPPL 
SPPSSPVILV 


6118
6119


1044 


247 


STI S CRACTSGATPG AQS HRS ARGHAAGGK ETAALGMERGKVKK 

KEKEKETQKEKIGEKGREEKVKRKEVEQKIKGEKQEKQERRKGK 

EKEEKRTKQGKETNKEKEQFKGQEEKGENKDSTLTRTPLEPLEK 

NKQILVLGLDGAGKTSVLHSLASNRVQHSVAPTQGFHAVCINTE 

DSOMEFLEIGGSKPFRSYWEMYLSN/ADSLARSFSVGFKQDSQP 

ITWKAXKYLHQLIAANPVLPLVVFANKQDLEAAYHITDIHEALA 
II 




1217 


462 


DPR FVTENTTKAPAQERTTQPRS S REGTLR S TME YLS ALN PS DL 
LRSVSNISSEFGRRVWTSAPPPQRPFRVCDHKRTIRKGIiTAATR 
QE LLAKATj ETLLLNGVLTL VLEEJDGTAVDSBDF FQLLEDDTCLM 
VLQSGQSWSPTRSGVLSYGLGRERPKHSKDIARFTFDVYKQNPR 
DLFGSLNVKATFYGIjYSMSCDFQGL\GPKKVLRELLRWTSTLLQ 
GLGHMLLG ISSTLRHAVEGAEQWQQKGRLiHS Y 


6120 


785


179 


LERAGGGGLSSRALVGSGACLSLVARANGKGLPRGRKEFVBAVR 
VRYVAFRYRTPRAVCLRLWSCRREVIMSGRGKQGGKVRAKAKSR 
S S RAG LQ F P VGRVHRLLRKGNYAERVGAGAP VYLAAVLBV LTAE 
I LELAGNAARDNKXTR 1 1 PRHLQLAIRNDEELNKLLGKVTI AQG 
G\VLPNIQAVLLPKKTESQKDEGANDP 


6121 


1612 


107 


FVRAQARGSRQPVRRPLLGAGSRLRCRSCGRMEPLKVEKFATAN 
RGN^GLRAVTPIiRPGELLFRSDPLArTVCiCGSRGWCDRCtiLGKE 
KLMRCSQCRVAKYCSAKCQKKAWPDHKRECKCLKSCKPRYPPDS 
VR LLGR WF KLMDGAPS ES E KL YS F YDL ESN INKLTED KKEGLR 
QLVMTFQHFMREEIQDASQLPPAFDLFEAFAKVICNSFTICNAE 
MQEVGVGLYPSISLLNHSCBPNCSIVFNGPHLLLRAVRDIEVGE 
E LT I CYLDMLMTSEERR KQLRDQ YC FECD\ CFRCQTQD KDADM L 
TGDEQVWKE VQESIiKKIEELKAHWKWEQVLAMCQAI 2 S SNSERL 
PDINIYQLKVLDCAMDACINLGLLEEALFYGTRTMEPYRIFFPG 
S HP VRGVQVMKVGKLQLHQGM F PQAMKNLRLAFDI MRVTHGREH 
SLIEDLI LLLE/ AMRRQHQS ILRERSQREIRRVSLLNALLRSHT 
LCF VS CVNLS YWKFCS VFV 


6122 


2 


2324 


llFRKMAiXSGAASQDESSAAAAAAADSRMNNP SETS KPSMESGDG 
NTGTQTNGLDFQ KQ P V P VGGA3 STAQAQAFLGHLHQVQLAGTS L 
QAAAQSLNVQSKSNEESGDSQQPSQPSQQPSVQAAIPQTQLMLA 
GGQITGLTLTPAQQQLLLOQAQAQAQIiliAAAVQQHSASQQHSAA 
GAT I S AS AAT PMTQ IPLSQPIQI AQDLQQLQQLQQQNLNLQQFV 
LVHPTTNLQPA\QFIISQTPQGQQGLLQA\QNLLTQLPRQSQAN 
LLQSQPRI \ TLTSQPATPTCTIAATP IQTLPQSQSTPKR I DTPS 
LEEP\SDLEELEQFAKTFKQRRIKLGFT\QGDAGLAMVKLYGND 
FS PTTI FRFEALNLS FKNMCKliK PL LE KWLNDAEN LS S D S S LSS 
PS ALNS PG I EG LS RRRKKRTS I EA\ N I RVALEKS FLEN\ QKPTS 
EEITMIADQLNMEKGVIRWFCNRRQKEKRINPPSSGG\TSSSP 
I KAI FPS PT SIiVATTPS LVTSS AATTLTVSP VLP LTS AA VTNLS 
VTGTSDTTSNNTATVISTAPPASSAVTSPSLSPSPSASASTSEA 
SSAS ETSTTQTTSTPLS S PLGTSQVMVTASGLQTA/AQLLPFKG 
AAQLPANASLAAMAAAAGLNPSLMAPSQFAAGGALLSLNPGTLS 
GALS PALMSNSTLATIQALASGGSLP ITSLDATGNLVFANAGGA 
PNIVTAPLFLNPQNLSLLTSNPVSLVSAAAASAGNSAPVASLHA 
TSTS AE S I QNS L FTVASASGAAS TTTTAS KAQ 


6123 


3 


2544


HLLHRWFGTDMQMINFTTGEFQLTEACPYLGTHSEESRFGILHL 

HLQPLEMKRVGWFTPADYGKVTSLILXRNNLTVIDMIGVEGFG 

ARELLKVGGRLPGAGGSLRFKVPESTLMDCRROLKDSKQILS IT 

KNFKVENIGPLPITVSSLKINGYNCQGYGFEVLDCHQFSLDPNT 

SRDISIVFTPDFTSSWVIRDLSLVTAADLEFRFTLNVTLPHHLL 

PLCAD W PGPSWEES FWRLT V FFVS LSLLG VIL 1 AFQQAQ Y I LM 

EFMKTRQRQNASSSSQQNNGPMDVISPHSYKSNCKNFLDTYGPS 

DKGRGKNCLPVNTPQSRIQNAAKRSPATYGHSQKKHKCSVYYSK 

HKTSTAAASSTSTTTEEKQTSPLGSSLPAAKEDICTDAMRENWI 

SLRYASGINVNLQKNLTLPKNLLNKEENTLKNTIVFSHPSSBCS 

MKEGI QTCM FPKETD I KTS ENTAE FKERELC PLKTS KKLPENHL 

PRNSPQYHQPDLPEISRKNNGNNQQVPVKNEVDHCENLKKVDTK 

PSSEKKIHKTSREDMFSEKQDIPFVEQEDPYRKKKLQEKREGNL 

QNLNWSKSRTCRKNKKRGVAPVSRPPEQSDLKLVCSDFERSELS 

SDINVRSWCIQESTREVCKADAEIASSLPAAQREAEGYYQKPEK 

KCVDKFCSDSSSDCGSSSGSVKASRGSWGSWSSTSSSDGDKKPM 

VDAQHFLPAGDSVSQNDFPSEAPISLNLSHNICNPMTGNSLPQY 

AE PS C PS L P AGPTG VEED KGL YS PGDLWPTP P VCVTS SLNCTLE 

NGVPCVIQESAPVHNSFIDWSATCEGQFSSAYCPLBLNDYNAFP 

E ENMN YANG FP CPAD VQTDF I DHNS QS TWNTP P\NM PAS \ WGNA 

QFPSSSRPYLKSTPKACLPMSGLFGPI\WAP\QSDVYENCCPIN 
PTTEHSD/THMENQA\WCKEyYPGF\NPFRAYMNljblW^
NRNANFPLSRDSSYCGNV 


6124 


1573


236 


SDKAIjRI^GERGMGRVQLFEISLSHGRVVYSPGEPLAGTVRVRL 
GAPLPFRAIRVTCIGSCX5VSNKANDTAWVVEEGYFNSSLSLADK 
GSLPAGEHSFPFQFLLPATAPTSFEGPFGKIVHQVRAAIHTPRF 
S KDH KCS LVFY I tiS PLNLNS I PD I EQ PNVAS ATKKFS YKLVKTG 
S WLTAS TD tjRG Y WGQAIiQLHADVENQSGKDTS P WASLLQ KV 
SYKAKRWIHDVRTIAEVEGAGVKAWRRAQWHEQILVPALPQSAL 
PGCSIr I H I D Y YLQ VS LKAPEAT VTLP VF IGNI AV/N PC PS E P PA 
RPGAASWGPTPGG\PSAPPQEEAEAEAAAGGPHFLDPVFLSTKS 
HSQRQPLLATLSSVPGAPEPCPQDGSPASHPLHPPLCISTGATV 

PYFAEGSGGPVPTTSTLILPPEYSSWGYPYEAPPSYEQSCGGATE 
PSLTPES 


6125 


1 


904 


KTCPKLTCAF TVS VP DSCCRVCRGDGELSWEHSDGD I FR"QPANR 
EARHSYHRSHYDPPPSRQAGGLSRFPGARSHRGALMDSQQASGT 
IVQIVINNXHKHGQVCVSNGKTYSHGESWHPNLRAFGIVECVLC 
TCNVTKQECKKIHCPNRYPCKYPQKIDGKCCKVCPG/KKAKEEL 
PGQSFDNKGYFCGBETMPVYESVFMEDGETTRK1ALETERPPQV 
EVHVWTIRKGILQHFHIEKISKRMFEELPHFKLVTRTTLSQWKI 
FTEGEAQISQMCSSRVCRTOLEDLVKVLYLERSEKGHC


6126 
6127 


1224


389 


RliLSEAPCPRSRRRFQMNPEWGQAFVHVAVAGGLCAVAVF^GIF 
DS VS VQVG YEH YAE AP VAGLPAFLAMP FNSL VNMA YTLLGLS WL 
HRGGAMGLGPRYLKDVFAAMALLYGPVQWLRLWTQWRRAAVLDQ 
WLTLPIFAWPVAWCLYLDRGWRP\MLFLSLECVSIASYGLALLH 
PQG FEVALGAHWPAVGQAIiRT \HRH YG/SATPSATYLALGVLS 

CLGFWLKLCDHQLARWRLFQCLTGHFVISKVCDVLQFHFAFLFL 
THFNTHPR FHPS GG KTR 




1335 


463 


VLPRRCLVFWNTMDSSRBPTLGRLDAAGFWQVMQRFDADEKGY 
IEEKELDAFFLHMLMKLGTDDTVMKANLHKVKQQFMTTQDASKD 
GRIRMKELAGMFLSEDENFLLLFRRENPLDSSVEFMQIWRKYDA 
DSSGFISAAELRNFLRDLFLHHKiCArSEAKLEEYTGTMMKIFDR 
NKIX3RLDLNDLARILALQENFLLQFKMDACSTEKRKGDFEKIFA 
YYDVSKTGALEGP\EVDGFVKDMMELVQPSISGVDLDKFREILL 
RHCDVNKDGKIQKSBLALCLGLKINP 


6128 


2511 


843 


TGRMSRRQLERWVWSSQQVQARGRNVRAPRLGKIAMGLEMSSKD 
SPGSLDGRAWEDAQKPQSAWCGGRKTRVYATSS RRAPPS EGTRR 
GGAARPEKTAEEGPPAAPGSLRHSGPLGPHACPTAIiPEPQVTSA 
MSSQWGIEPLYIKAEPASPDSPKGSSETETEPPVA1jAPG\PAP 
TRCLPGHKEEEDGEGAGPGEQGGGKLVLSSLPKRLCLVCGDVAS 
GYHYGVASCEACiCAFFKRTIQGSIEYSCPASNECEITKRRRKAC 
QACRFTKCLRVGMLKEGVRLDRVRGGRQKYKRRPEVDPLPFPGP 
FPAG P LAVAGGP R KTAAP VNALVSHLLWE PEKL YAM PDPAG PD 

GHLPAVATLCDLFDREIWTISWAKSIPGFSSLSLSDQMSVLQS 
VWMEVLVLGVAQRSLTLQDELAFAEYLVLDEEGARPAGLGELG\ 
AALLQLVRRLQALRLEREKYVLLKALALANSDSVHIEDEPRLWS 
S CEKLIiH EALLE YE AG RAGPGGG AERR RAGRLLLTL PL LRQTAG 
KVLAHFYGVKLEGKVPMHKLFLSMLBAMMD 


6129 
6130 


1764 
3 


771 
577 


ARFARS AHEGKMP KKKTGARKKAENRRERE KQLRAS RS TIDLAK 
HPCNASMECDKCQRRQKNRAFCYFCNSVQICLPICAQCGKTKCMM 
KSSDCVIKHAGVYSTGLAMVGAICDFCEAWVCHGRKCLSTHACA 
CPLTDAEC\VECERGVWDHGGRIFSCSFCHNFLCEDDQFEHQAS 
CQ VLEAE TPK CVS CNRLGQHS CLRCKACFCDDHTRS KVFKQEKG 

KQPPCPKCGHETQETKDLSMSTRSLKFGRQTGGEEGDGASGYDA 
YWKNLSSDKYGDTSYHDEEEDEYEAEDDEEEEDEGRKDSDTESS 
DLFTNLNLGRTYASGYAHYEEQEN 

3RGGTMRE YK VWLGSG \ G VGKS ALTV \QFVTCTFI E K YD PT 1 E 
SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








DFyRKEIEV\DSSPSVAGISWTQQGTEQFVASMRDLYIKKGQGC 
ILVYSLVNQQSPQ\DIKPMRDQIIRVKVSEKVPVI\LVGN\SVD 
LESEREVSSSEGRALAEEWGCPFMETSAKSKTMVBELFAEIVRQ 
MNYAAQPDKDDPCCSACNIQ 


6131 


3 


1811 


SSPREKTSDSSHRPSRHGFLFLRLVGLSPPSYLCVPPSRPVPGS 
PRSLSAMRLLPLAPGRLRRGSPRHLPSCSPALLLLVLGGCLGVP 
GVAAGTRRPNWliLLTDDQDEVLGGMTPIiKKTKALIGEP^GMTFS 
SAYVPSALCCPSRASIIiTGKYPHNHHWNNTLEGNCSSKSWQKI 
QE PNTFPAI LRSMCGYQTFF\AGKYLNE YGAPDAGGLEHVPLGW 
S YWYALBKNS KYYNYTLS INGKARKHGENYSVDYLTDVLANVSL 
D FIjD YKSNFEP F FMMTATP \ APHS P WTAAPQ YQKAFQNVFAPRN 
KNFN I HGTNKHWLIRQAKTPMTNS 2 IQ FLDNAFRKRWQTLLS VD 
DLVEKLVKRLEFTGELNNTYIFYTSDNGYHTGQFSLPIDKRQLY 
E FD X KVPLL VRG PG I KPNQTS KML VAN I DLG PT I LD I AG YDLN K 
TQMDGMSLLPI LRGASNLTWRS DVLVEYQGEGRNVTD PTCPSLS 
PGVSQCFPDCVCEDAYNNTYACVRTMSALWNLQYCEFDDQEVFV 
BVYNLTADPDQITNIAKTIDPELLGKMNYRLMMLQSCSGPTORT 
PG VFDPG YR FD PRLM FS NRGS VR TRRFS KHLL 


6132 


96 


1241 


AAGLLPPGLVPEDPRRTRNLLPFGIQGPPFALSRPLFSCVESGW 
AWEAMEPEFLYDLLQLPKGVEPPAEEELSKGGKKKYLPPTSRKD 
PKFEELQKPA\VLMEWINATLLPEHIWRSLEEDMFDGLILHHL 
FQRLAALKLEAEDIALTATSQKHKLTWLEAVNRSNCSWRSGRP 
SGA/WESIFNKDLLSTLHLLVAUUCRFQPDLSLPTNVQVEVITI 
ES TKSGLKS E KL VEQLTE YS TDKDE P PKDVFDEL FKLAPE KVNA 
VKEAIVNFVNGKLDRLGLSVQNLDTQFADGVILLLLIGQLEGFF 
LHLKEFYLTPNSPAEMLHNVTLALELL/IGRGPAQLPC/LALK/ 
TIVNKDAXSTLRVLYGLFCKHTQKAHRDRTPHGAPN 


6133 


2 


4256 


FVHG S MADTDL FMECEE E E LEPWQKI S DVI EDS WED YNS VDKT 
TTVS VSQQ PVSAPVPIAAHAS VAGHLS TSTTVSS SGAQNSDSTK 
KTLVTLIANNNAGNPIiVQQGGQPLILTQNPAPGLGTMVTQPVLR 
P VQVM QNANHVTSS P VASQP I FITTQGFPVRNVRP VQNAMNQ VG 
I VLNVQQGQTVR P I TLVPA PGTQFVKPT VG VPQVFSQMTPVR PG 
STMPVRPTTNTFTTVIPATLTIRSTVPQSQSQQTKSTPSTSTTP 
TATQPTSLGQLAVQSPGQSNQTTNPKLAPSFPSPPAVSIASFVT 
VKRPGVTGENSNEVAFCLVNTLNTIPSLGQSPGPVWSNNSSAH\ 
GSQRTSGPESSMKVTSS I PVFDLQDGGRKICPRCNAQFRVTEAL 
RGHMCYCCPEMVEYQKKGKSLDSEPSVPSAAKPPSPEKTAPVAS 
/THPSSTPIPALSPPY/TKVPEPNENVGDAVQTKLIMLVDDFYY 
GRDGGKVAQDTNFPKVATSFRCPHCTKRLK^ 

IiDQQNGEVDGHTICQHCYRQFSTPFQIjQCHLENVHSPYESTTKC 
KICEWAFESEPLFLQHMKDTHKPGEMPYVCQVCQYRSSLYSEVD 
VHFRMIHEDTRHLLCPYCLKVFKNGNAFQQHYMRHQKR\NVYH\ 
CNKCRVQFLFAKDKIEHKLQHHKTFRKPKQLEGLKPGTKVTIRA 
SRGQPRTVPVSSNDTPPSALQEAAPLTSSMDPLPVFLYPPVQRS 
IQKRAVRKMS VKGRQTCLECS FE I PDFPNHFPTYVHCSLCR YST 
CCSRAYANHMINNHVPRKSPKYLALFKNSVSGIKLACTSCTFVT 
SVGDAMAKHLVFNPSHRSSSILPRGLTWIAHSRHGQTRDRVHDR 
NVKNMYP PPS FPTNKAATVKSAGATPAE PEEI#LTPLAPALPSPA 
STATPPPTPTHPQALALPPLATEGAECLNVDDQDEGSPVTQEPE 
LASGGGGS GG VGKKEQLSVKKIiRVVLFALCCNTEQAAEHFRNPQ 
RRIRRWLRRFQASQGENLEGKYLSFEAEEKliAEWVLTQREQQLP 
VNEETLFQKATKIGRSLEGGFKISYEWAVRFMLRHHLTPHARRA 
VAHTL PKD VAENAGL F IDFVQRQ I HNQD LPLS M I VA IDE I S LFL 
DTEVLSSDDRKENALQTVGTGB PWCTWLAILADGTVLPTIiVFY 
RGQMDQPANMPDSILLEAKESGYSDDEIMELWSTRVWQKHTACQ 
R5KGMLVMDCHRTHLSEEVLAMLSASSTLPAWPAGCSSKIQPL 
DVCIKRTVKNFLHKKWKEQAREMADTACDSDVLLQLVLVWLGEV 
LGVI GDCPELVQRS PLVAS VLPGPDGNINS PTRNADMQEBLI AS 
IiEEQLKLSGEHSESSTPRPRSSPEETIEPESLHQLPEGESETES 
FYGFEEADLDLMEI 


6134 


2 


4256 


PVHGSMADTDLFMECEEEELEPWQKISDV1EDSWEDYNSVDKT 
TTVSVSQQPVSAPVPIAAHASVAGHLSTSTTVSSSGAQNSDSTK 
KTLVTLIANNNAGNPLVQQGGQPLILTQNPAPGLGTMVTQPVLR 
P VQVMQNANHVTSS PVASQPI F ITTQGFPVRNVRPVQNAMNQVG 
IVLNVQQGQTVRP I TLVPAPGTQFVKPTVGVPQVFSQMTPVRPG 
STMP VR PTTNTFTTV I P ATLT I RS T VPQSQSQQTKS TPS TSTTP 
TATQ P TS LGQIAVQS PG QSNQTTNP KLAPS FPS P PAVS I AS FVT 
VKR PG VTGBNS NE VAKL VNTLNT I PS LGQS PG P WVSNNS S AH\ 
GSQRTSGPBSSMKVTSS I PVFDLQDGGRKICPRCNAQFRVTEA1* 
RGHMCYCCPEMVEYQKKGKSLDSEPSVPSAAKPPSPEXTAPVAS 
/THPSSTPIPALSPPY/TKVPEPNENVGDAVQTKLIMLVDDFYY 
GRDGGKVAQLTNFPKVAT SFRCPHCT KRIjKNN I R FMNHMKHHVE 

u3qqkgevdghti cqhcyrq fstpfqlqchlenvhs p yes ttkc 
kicewafeseplflqhmkdthkpgempyvcqvcqyrsslysevd 
vhfrmihedtrhllcpyclkvfkngnafqqhymrhqkr\nvyh\ 
cnkcrvqflfakdkiehklqhhktfrkpkqleglkpgtkvtira 
srgqprtvpvssndtppsalqeaapltssmdplpvflyppvqrs 
iqkravrkmsvmgrqtclecsfeipdfpnhfptyvhcslcryst 

CCSRAYANHMINNHVPRKSPKYIiALFKNSVSGIKLACTSCTFVT 

s vgdamakhlvfnp shrs s s i lprglt wi ah5 rhgqtrdr vrdr 
nvknm yppps fptnkaatvksagatpaepeelltplapalp s pa 
statppptpthp qalalp plategae clnvddqdegs p vtqe pe 
lasggggsggvgkkeqlsvkklrwlfalcckteqaaehfrnpq 
rrirrwlrrfqasqgenlegkyiis feaeeklaewvltqreqqlp 
vneetlfqkatk igrs legg fki s ye wavr fmlrhhlt pharra 
vahtlpkdvaenaglfidfvqrqihnqdlplsmivaideislfl 
dtevlssddrkenalqtvgtgepwcdvvlailadgtvlptlvfy 
rgqmdqpanmpdsilleakesgysdde imelws trvwq khtacq 
rskgmlvmdchrthls eevlamls as s tlpawpagcs s k i qpl 
dvci krtvknflhkkwkeqaremadtacdsdvllqlvlvwlgev 
lgvigdcpelvqrs flvasvlpg pdgnins ptrnadmqeelias 

leeqlklsgehsesstprprsspeetiepeslhqlfegesetes 
fygfeeadldlmei 


6135 


2 


4256 


F VHGSMADTDLFME C E EEE LE P WQ KI SD V I EDS WED YNSVDKT 
TTVS VSQQ P VS AP VP I AAHAS VAGHLS TSTTVS S SGAQNSDSTK 
KTLVTLIANNNAGNPLVQQGGQPtilLTQNPAPGLGTMVTQPVLR 
P VQVMQNANHVTSS PVASQPI FITTQGFPVRNVRPVQNAMNQVG 
IVLNVQQGQTVRPITLVPAPGTQFVKPTVGVPQVFSQMTPVRPG 
STMPVRPTTNTFTTVIPATLTIRSTVPQSQSQQTKSTPSTSTTP 
TATQ PTS LGQLAVQS PGQSNQTTNPKLAPS FPS PPAVS I AS FVT 
VKRPGVTGENSNEVAKLVNTLNTIPSLGQSPGPVWSNNSSAH\ 
GS QRTSG PES SMKVTS S I PVFDLQDGGRKI C PRCNAQFR VTEAL 
wnnv, X hnva x UKwKoIiDSE PS VPSAAKPPSPEKTAPVAS 
/THPSSTPIPALSPPY/TKVPEPMENVGDAVQTKLIMLVDDFYY 
GRDGGKVAQLTNFPKVATS FRCPHCTKRLKNN I RFMNHMKHHVE 
LDQQNGEVDGHTICQHCYRQFSTPFQLQCHLBNVHSPYESTTKC 
KI CE WAF E S E PLFLQHMKDTKKPGEMP YVCQVCQ YRS SL YSE VD 
VHFRMIHEDTRHLLCPYCLKVFKNGNAFQQHYMRHQKR\NVYH\ 
CNKCRVQFLFAKDKIEHKLQHHKTFRKPKQLEGLKPGTKVTIRA 
SRGQPRTVPVSSNDTPPSALQEAAPLTSSMDPLPVFLYPPVQRS 
IQKRAVRKMSVMGRQTCLECSFEIPDFPNHFPTYVHCSLCRYST 
CCSRAYANHMINNHVPRKSPKYLALFKNSVSGIKLACTSCTFVT 
S VGD AMAKHL VFNPSH RSS S I LPRGLTW X AHS RHGQTRDR VHDR 
NVKNMYPP P3 FPTNKAATVKSAGATPAE PEELLTPLAPALPS PA 
STATPPPTPTHPQALALPPLATEGAECLNVDDQDEGSPVTQEPE 
jjrtovjLavjoo vao v uiUULULia v K^bRVvLFALCCNTEQAAEHFRIJPQ 
R R I RR WLRR FQAS QG ENLEG KYLS FEAEB KLAEWVLTQR EQ QLP 
WEETLFQKATKIGRSLEGGFKISYEWAVRFMLRHHLTPHARRA 
VAHTL PKD VAENAGL F I DFVQRQI HNQDL PLS MI VAI DE I S LFL 
DTEVLSSDDRKENALQTVGTGEPWCDVVLAlLADGTVLPTIiVFY 
RGQMDQPANMPDSILLEAKESGYSDDEIMELWSTRVWQKHTACQ 
RSKGMLVMDCHRTHLSEEVLAMLSASSTLPAWPAGCSSKIQPL 
DVCI KRTVKNFLHKKWKEQAREMADTACDSDVLLQLVLVWLGBV 
LGVIGDCPELVQRS FLVAS VLPGPDGNINSPTRNADMQEE L IAS 
iji2.EyjjKi»bC»iiHSeSSTPRPRSSPEETIEPESLHQLFEGESETES 
F YGFEEADLDLME I 


6136 


1704 


539 


FGVRMALEGMSKRKRKRSVQEGENPDDGVRGSPPEDYRLGQVAS
SLFRGEHHSRGGTGRLAS LFSSLEPQIQP VYVP VPK\ ESALASA 
DLEEEIHQKQGQKRKNSQPGVKVADRKILDDTEDTWSQR-CKIQ 
INQEE ERLKNERTVF VGNLP VTCN K KKLKS FFKEYGQI ESVRJFR 
SLIPAEGTLSKKLAAIKRKIHPDQKNINAYVVFKBESAATQALK 
RNGAQI ADG FRIRVDLASETSSRDKRS VFVGNLP YKVE ESAI EK 
HFLDCGSIMAVRIVRDKMTGIGKGFGYVLFBNTDSVHLALKLNN 
SE LMGRKLRVMRS VNKEKFKQQNSNPRLKKVS KPKQGLNFTS KT 
AEGHPKSLF1GEKAVLLKTKKKGQKKSGRPKKQRKQK 


6137 


141 


2656 


RALRKRRCGPGRRGALGSGPGPQRRFGRVPEERPAPPRERKHPG 
MWNMLIVAMCLA\LLGLPGKAQELQGHVS\IILAGEQLGDLAKK 
YLWQG\LFQLYLDEAGRGHSFSFHGAALTAPKQGQELMAKALES 
LSCPKDMAPSHCAEHKDQFLQLSQYRQLKTAEDYQALNKDIEAQ 
LQHAGLREAGGIFYFSVPPFAYEDIARNINSSCRPGPGAWLRVV 
LEKPFGHDHFSAQQLATELGTFFQBEEMYRVDHYLGKQAVAQIL 
PFRDQNRKALDGLWNRHHVERVEI I MKET VDAEGRTS FYE E YG V 
IRDVLQraLTBVLTLVAMELPHNVSSAEAVLRHKLQVFQALRGL 
QRGSAWGQYQS YS EQVRRELQKPDS FHSLTPTFAGVLVH IDNL 
RWEG VP FILK S G KALDER VG YARI LFKNQACCVQS E KHWAAAQS 
QCLPRQLVFHIGHGDLGSPAVLVSRNLFRPSLPSSWKEMEGPPG 
LRLFGSPLSDYYAYSPVRERDAHSVLLSHI FHGRKNFFITTENL 
LASWNFWTPLLESLAHKAPRLYPGGAENGRLLDFEFSSGRLFFS 
QQQPEQLVPGPGPGPMPSDFQVLRAKYRESSLVSAWSEELISKL 
ANDI EATAVRAVRR FGQFHLALSGGS S PVALFQQLATAHYG FP W 
AHTHLWLVDERCVPLSDPESNFQGLQAHLLQHVRIPYYNIH\AI^I 
PVHLQQRLCAEEDQGAHIYAREISALGANSSFDLVIjLGMGADGH 
TASLF PQS PTG LDG EQL WLTTSPSQ PHRR KSLSL PL INRAK KV 
AVLVMGRMKREI TTLVSR VGHEPKKWP ISGVLPHSGQLVW YMDY 
DAFLG 


6138 


4587 


9*4 


EFS KLTDRWQNAVQGVRQRKGDVDGLVRQWQDFTTS VENLFRFL 
TDTSHLLSAVKGQERFSLYQTRSLIHELKWKEIHFQRRRTTCAL 
TLEAGEICLLLTTDLKTKESVGRRISQLQDSWKDMEPQLAEMIKQ 
FQSTVETWDQCEKKIKEUCSRLQVLKAQSEDPLPELHEDLHNEK 
ELI KELEQSLASWTQNLKELQTMKADLTRHVLVEDVMVLKEQIE 
HLHRQWEDLCLRVAIRKQEIEDRLNTWWFNEKNKELCAWLVQM 
ENKVLQTADISIEEMIEKLQKDCMEEINLFSENKLQLKQMGDQL 
I KASNKS RAAEI DD KLNKINDRWQHLFD V I GS RVKKLK E T FAFI 
QQLDKNMSNLRTWLAR I ESELS KP WYDVCDDQ E IQKRLAEOQD 
LQRD I EQHS AG VES VFNI ODVLLHDSDACANETECDS I QQTTRS 
LDRRWRNICAMSMERRMKIEETWRLWQKFLDDYSRFEDWLKSAB 
RTAACPNSSEVLYTSAKEELKRFEAFQRQIHERLTQLELINKQY 
RRLARENRTDTASRLKQMVHEGNQRWDNLQRRVTAVLRRLRHFT 
NQREE FEGTRES I IiVWLTEMDIjQ LTNVEH PS ES DADDKMRQLNG 
FQQEITLNTNK2DQLrVFGEQLIQKSEP\LDAVLIEDELEELHR 
YCQEVFGRVSR FHRRLTSCTPGLEDEKEASENETDMEDPRE I QT 
DS WR KRG ES EE P SS PQSLCHLVAPGHERSGCETPVS VDS \ I PLE 
WDHTGRRGGPSSSH\EEDEEAQYY\SALSGKSISDGHSWHVPDS 
PSCPEHHYKQMEGDRNVPPVPPASSTPYKPPYGKLLLPPGTDGG 
KEGPRVLNGNPQQEDGGLAGITEQQSGAFDRWEMIQAQEL\HNK 
LKIKQNLQQLNSDISAITTWLKKTEAELEMLKMAKPPSDIOEIE 
LRVKRLQEILKAFDTYKAIiWSVNVSSKEFIiQTESPESTELQSR 
LRQLSLLWEAAQGAVDSWRGGLRQSLMQCQDFHQLSQNLLLWLA 
SAKNRRQKAHVTDPKADPRALLECRRELMQLEKELVERQPQVDM 
LQEISNSLLIKGHGEDCIEAEEKVHVI\EKKLKQLREQVSQDLM 
ALQGTQNPAS PLPS FDEVDSGDQPPATS VPAPRAKQFRAVRTTE 
GBEETESRVPGSTRPQRSFLSRWRAALPLQLLLLLLLLLACLL 
PSSEEDYSCTQANNF\ARSFYPMLRYTNGPPPT 


6139 


52 


1131 


LGDWVWSRTCGVLETPTSVLRRARARGPCPTDSKWALPRLREGE 
TERRPWEASSWKTL/LAGWIGGAASVIVGHPLDTVKTRLQAGVG 
YGNTLSCIRWYRRESMFGFFKGMSFPLASIAVYNSWFGVFSN 
TQRFIiSQHRCGEPEASPPRTLSDLLLASMVAGWSVGLGGPVDli 
IKIRLQMQTPPVSGRQPRFEVQGSGSOG\EPAYQGPVHCITTIV 
RNEG LAGL YRGASAMl»IiRDVPG YCLYF I P YVFLSE W I TPEACTG 
PSPCAVWLAGGMAGAI SWGTATPMDWKSRLQADGVYLNKYKGV 
LDC I SQS YQKEGL KVF F RGITVNAVRGFPMS AAMFLGYE LS LQA 
IRGDHAVTSP 


6140 


694 


136 


rpelelwrlrsrswrplgvprrchrrnwkbpvrAOplsvtvwap 
rcqrp/qppapepsspnaavpeai ptpraaasaalelplgpapv 
svapqaeaearstpgpagsrlgpetfrqrfrqfryqdaagprea 
frqlrel/sprqwlrpdl\rtkeq\lvemlvqeqllaiiipeaar 
arrirrrtdvritg 


6141 


2 


984 


AQVGPRSRPCKMPLKLRGKKKAKSKETAGLVEGEPTGAGGGSLS 
ASRAPARRLVFHAQLAHGSATGRVEGFSSIQELYAQIAGAFEIS 
PSE I LYCTLNTPKIDMERLLGGQLGLEDFI FAHVKGIEKE VNVY 
K5EDS LGLTI TDNG VG YAF I KRIKDGGVIDSVKTICVGDHI ES I 
NGENIVGWRHYDVAKKLKELKKEELFTMKLIEPKKAFEIELRSK 
AGKSSGEKIGCGRATLRLRSKGPATVEEMPSETKAK\AIEKIDD 
VLE L YMG I RD I DLATTMFEAGKDKVN PD E FAVALDETLGDFAF P 
DBF VFD VWG VI GDAKRRGL 


6142 


116 


602 


EAEGEQVCGAKCCGDAPHVENREEETARIGPGVMESKEERAl^N 
LIVENVNQENDEKDEKEQVANKGEPLALPLNVSEYCVPRGNRRR 
FR VRQ P 1 1^3 YRWD I MHRLGE PQARMR EENMERI GE E VRQLME KL 
REKQLSHSLRAVSTDPPHHDHHDEFC\LMP i 


6143 


2802 


270 


FRMR I FLHCPWNQQMWKI WNLLETSLESCKAHLS IQKLLKER \Q 
\QLPVFKHRDS I VETLKRHRWWAGET\GSGKSTQVPHFLL ED 
LLLN E WEASKCN I VCTQ PRR I S AVSIiANRVCDE LGCENGPGGRN 
SLCGYQIRMESRACESTRLLYC1TGVL1>RKLQEDGLLSNVS/HM 
FI VDE V\HER\ S VQSDFLL I I LKE I LQKRSDLHL I LMSATVDS E 
KFSTYFTHCPILRISGRSYPVEVFHLEDIIEETGFVLEKDSEYC 
QKFLEEEEBVTINVTSKAGGIKKYQEYI PVQTGAHADLNPFYQK 
YSSRTQHAILYMNPHKINLDLILELLAYLDKSPQFRNIEGAVLI 
FLPGIiAHIQQLYDLLSNDRRFYSERYKVIALHS ILSTQDQAAAF 
TLPPPGVRKIVLATNIAETGITIPDWFVIDTGRTKENKYHESS 
QMSSLVETFVSKASALQRQGRAGRVRDGFCFRMYTRERFEGFMD 
YSVPEILRVPLEELCLHIMKCNLGSPEDPLSKALDPPQLQVISN 
AMNLLRKIGACELNEPKLTPLGQHLAALP VNVKIGKML I FGAI F 
GCLDP VATtAAVMTE KS PFTT P XGRXDEADLAKSALAMADSDHL 
T I YNAYLGWKKARQ EGG YRS E I TYCRRTi FLNRTSLLTLEDVKQE 
LIKLVKAAGFSSSTTSTSWEGNRASQTLSFQBIALLKAVLVAGL 
YDNVGKIIYTKSVDVTEKLACIVETAQGKAQVHPSSVNRDLQTH 
GWLLYQEKIRYARVYLRETTLITPFPVLLFGGDIEVQHRERLLS 
IDGWIYFQAPVKIAVIFKQLRVLIDSVLRKKLENPKMSLENDKI 
LQIITELIKTENN 


6144


1289 


568 


SGPGS MSGQRVDVKWMLGKB Y VGKTS L VE R YVHDRFLVG P YQN 
VSASGGARHGGRGSGGPVICTYGPDLFPLVA\TIGAAFVAKVMS 
VGDRTVTLGIWDTAGSERYEAMSRIYYRGAKAAIVCYDLTDSSS 
FE RAKFWVKE LRS LEEGCQI YLCGTKS DLLEEDRRRRRVDFH DV 
QD YADN IKAQLFE TSS KTGQ S VDEL FQKVAEDY VSVAAFQVMTE 
DKGVDLGQKPNPYFYS CCHH 


6145 


1109 


196 


GGMDLSELERDNTGRCRLSSPVPAVCRKEPCVLGVDEAGRGPVL 
GPMVYAICYCpLPRLADIiEALKVADSKTLLESERERLFAKMEDT 
DFVGWALDVLSPNLISTSMLGRVKYNLNSLSHDTATGLIQYALD 
QGVNVTQVFVDTVGMPETYQARLQQSFPG I EVT VKAKADALYPV 
\VSAAS ICAKVARDQAVKKWQFVEKLQDLDTDYG\SGYPNDPQD 
/TKAWLKEHVEPVF\GFP\QFVRF\SWRTAQTI\LEKEAEDVIR 
EDSASENQEGLRKITSYFLNEGSQARPRSSHRYFLERGLESTTS 
L 


6146 


428 


781 


LKKKGKEKAEAQQVEALPGPSLDQWHRSAGEEEDGPVLTDEQKS 
R / YPGHEAHDQGG\WDARQS I IRKWDPETGRTRLI KGDGEVLE 
EI VTKERHRE INKQATRGDCLAFQMRAGLLP 


6147 


1 


2304 


GTRQLPPPSPGSGPGDSPEflPfifibAPfiJiJRRKAHGMLKLYYGLSE 
GEAAGRPAGPOPLDPTDLNGAHFDPEVYLDKLRRECPLAQLMDS 
ETDMWQIRALDSDMQTLVYENYNKFISATDTIRKMKNDFRKME 
DEMDRLATNMAVITDFSAR I SATLQDRHERITKLAGVHALIiRKL 
QFL FEL PSRLTKC VELGAYGQAVR YQGRAQAVIiQQ YQHLPS FRA 
IQDDCQVITARLAQQLRQRFREGGSGAPEQAECVELLLALGEPA 
EELCEEFLAHARGRLEKELRNLEAELGPSPPAPDVLEFTDHG\S 
SG FVGGLCQ VAAAYQELFAAQG PAGAEKLAAFARQLGSRYFALV 
ERRIiAQEQGGGDNSLiLVRALDRFHRRLRAPGALLAAAGLADAAT 
EIVERVARERLGHHLQGLRAAFLGCLTDVRQALAAPRVAGKEGP 
GLAELLANVASSILSHIKASLAAVHLFTAKEVSFSNKPYFRGEF 
CSQGVREGLIVGFVHSMCQTAQSFCDSPGEKGGATPPALLLLLS 
RLCLDYETATIS YILTLTDEQFLVQDQFPVTPVS TLCAEARE TA 
RRliLTHYVKVQGLVI SQMLRKS VETRDWLSTLE PRNVRAVMKRV 
VEDTTAIDVQVLPRJUAGVALTQAGGTVPSRGAGAAEDHWQSLPG 
GGDM C I WASHGAS S VARAS VR EPQGNKS PRMNTKRAGECLCPRS 
CS FSAQD YDI FAP I L P VEKQRLRVTQE VRAGI<VL VLK I RPQTNS 
CILPLPHSTGSINSDHVPTK 


6148 


305* 


353 


VPAVGGTFADGAMGEAEKFHYIYSCDIiDINVQLKIGSLEGKREQ 
KSYKAVLEDPMLKFSGLYQETCSDLYVTCQVFAEGKPLALPVRT 
SYKAFSTRWNWNEWLKLPVKYPDLPRNAQVAIiTIWDVYGPGKAV 
PVGGTTVSLFGKYGMFRQGMHDLKVWPNCRSQMDQKPTKTPGRT 
SSTLSEDQMSRLAKLTKAHRQGHMVKVDWLDRLTFREIEMINES 
VKR S SN FM YLMGG FRCVKCDD K E YG I VY YEKDGDES S P I LTS FE 
LVKVPDPQMSLENLVESKHHNLPRSLRSGPSDHDLKPYPSPRDQ 
LKNIVSYPPSKPPTYEEQDLVWEFRYYLTNQDKALTKILTSVIW 
DLPQGAKQALALLGKWKPMDVEDSLELLSSHYTNPTVRRYAVAR 
LRQADDEDLLMYLLQLVQALKYENFDDIKNGLEPTKKDSQSSVS 
ENVSNSGINSAEIDSSQIIT/SAPFPSVSSPPP\ASKTKEVPDG 
ENLEQDLCTFLISRASKNSTLANYLYWYVIVECEDQDTQQRDPK 
THEMYLNVMRRFSQALLKGDKSVRVMRSIiAAQQTFVDRLVHLM 
KAVQRESGNRKKKNERLQALLGDNEKMNLSDVELIPLPLEPQVK 
IRGIIPETATLFKSALMPAQLFFKTEDGGKYPV1FKHGDDLRQD 
QLILQIISLMDKLLRKENLDLKLTPYKVLATSTKHGFMQFIQSV 
PVAEVLDTEGSIQNFFRKYAPSENGPNGISAEVMDTYVKSCAGY 
CVITYILGVGDRHLDNLLLTKTGKLPHIDFGYILGRDPKPLPPP 
MKLNKEMVEGMGGTQSEQYQEFRKQCYTAPLHLRRYSNLILNLF 
SLMVDANIPDIALEPDKTVKKVQDKFRLDLSDSEAVHYMQSLID 
ESVHALFAAWEQIHKFAQYWRK 


6149 


1


1413 


RVDPRVRENGTANPIKNGKTSPASKDQRTGKKTSVQGQVQKGND
ESESDFESDPPSPKSSEEEEQDDEEVLQGEQGDFNDDDTEPENL 
GHRPLLMDSEDEEEEEKHSSDSDYEQAKAKYSDMSSVYRDRSGS 
GPTQDLNTI IjLTSAQLS S DVA VETPKQE FD VFGAVPFFAVRAQQ 
PQQEKNEKNIiPQHRFPAAGLEQEEFDVFTKAPFSKKVNVQECHA 
VGPEAHTIPGYPKSVDVFGSTPFQPFLTSTSXSESNEDLFGLVP 
FDEITGSQQQKVKQRSLQKLSSRQRRTKQDMSKSNGKRHHGTPT 
S TKKTLKP T YRTPERARRHKKVGRRDSQSSNE FLT I S DS KEN I S 
VALTDGKDRGNVLQPE ES LU3PFGAKPFHS PD\ LS WHPP\HQGL 
S \ D I RADHNT \ VLPGR \ PRQNSLHGSFHS ADVL KMDD FGAVP / F 
LTELWQS ITPHQSQQSQPV\ELDPFGAAPFPS kq 


6150 


372 


37 


MSNIKKYIIDYDWKASlEIEtDHDVMfEEKiHQINNFWSDSEYR 
LNKHGS VLNAVLI MLAQHALL I A I S SDLNAYGWCEFDWNDGNG 
QEGWPPMDGSEGIRITDIDTSGIF 1 


6151
6152


1555 


521 


DSNQQSVSGTAASTLLHS FKATI YYQGTGHVQQF YGVTS PYSQT 
TPPIVQSYAOPSLQYIQGQQIFTAHPQGVWQPAAAVn-IVAPG 
QPQ PLQPSEMWTNNLLDIiPPPS PPKPKTI VLPPNWKTARDPEG 
KIYYYHVITRQTQWDPPTWBSPGDDASLEHEAEMDLGTPTYDEN 
PMK\ASKKPKTAEADTSSELAKKSKEVFRKEMSQFIVQCLNPYR 
KPDCKVG\RITTTEDFKHLARKLTHGVMNKELKYCKNPE\DLEC 
NENVKHKTKEYI KKYMQKFGAVYKPKEDTBFRVTVGPGWEDGWS 
GKTDS R ERKS CG PFCSTP VS TVLLM I HHPGB FNPADVN 




1366 


648 


NRTWSTPSTWMGVAIipPLCSTGPWPVTRQITARTTCGAVPAKCP 
PWC/DVHEPRCQPPDCHGHGTCVDGHCQCTGHFWRGPGCDEIiDC 
GPSNCSQHGLCTETGCRCDAGWTGSNCSEECPLGWHGPGCQRPC 
KCEHH CPCDPKTGNCS VSR VKQCLQ P PEATLRAG ELSFFTRTAW 

IiALTLALAFLLLISTAANLSLLLSRAERNRRLHGDYAYHPLQEM 
NGE P LAAE KEQPGGAHNP FKD 


6153 


2 


33^8 


GRVGARS PGRAYALLLLL I CFNVGSGLHLQVLS TRNENKLLP KH 
PHLVRQ KRAW I TAP VALLEGEDLSKKNP I AKIHS DLAEERGLKI 
TYKYTGKG I TE P P FG I FVFN KDTGELNVTS IIjDREETPF FLLTG 
YALDARGNNVEKPLELRIKVLDINDNEPVFTQDVFVGSVEELSA 
AHTLVMKINATDADEPNTLNSKISYRIVSLEPAYPPVFYLNKDT 
GE I YTTS VTLDREEHSS YTLTVEARDGNGE VTDKPVKQAQVQ IR 
I LDVNDN I P WENKVLEGM VE ENQ VNVE VTRI KVPD ADE I GS DN 
WLANFTFASGNEGGYFHIETDAQTNEGIVTLIKEVDYEEMKNLD 
FSVIVAKKAAFHKSIRSKYKPTPIPIKVKVKNVKEGIHFKSSVI 
SIYVS ESMDRSSKGQI IGNFQAFDEDTGLPAHAR YVKLEDRDNW 
I S VDS VTSE I KLAKL PDFES R YVQNGTYTVKI VAI S ED YPR KT I 
TGTVL1NVEDINDNCPTLIEPVQTICHDAEYVNVTAEDLDGHPN 

SGPFSFSVIDKPPGMAEKWKIARQESTSVLLQQSEKKLGRSE IQ 
FIiISDNOGFSCPEKOVLTT.TVrRVT.HrtQ \ nmvt^nurkCVXTnr f~>r\ 

AAIALMI LAFLLLLLVPLLLLMCHCGKGAKGFTP I PGTIEMLHP 
WNNEGAPPEDKWPSFLPVDQGGSLVGRNGVGGMAKEATMKGSS 
SASIVKGQHEMSEMDGRWEEHRSLLSGRATQFTGATGAI\MTTE 
TTITARATGASRD VAGAQAAAVALNBE FLKN Y FTDKAAS YTEED 
ENHTAKDCLLVYSQEETESLNAS IGCCSFIEGELDDRFLDDLGL 
KFKTLAEVCIiGQKIDINKEIEQRQKPATETSMNTASHSLCEQTM 
VNSENTYSSGSSFPVPKSLQEANAEKVTQEIVTERSVSSRQAQK 
VATPLPDPMASRNVIATETSYVTGSTMPPTTVlLGPSQPQSIilV 
TERVYAPASTLVDQPYANEGTVWTERVIQPHGGGSNPLEGTQH 
LQDVPYVMVRER2SFLAPSSGVQPTIiAMPNIAVGQNVTVT2RVL 
APASTLQSSYQIPTENSMTARNTTVSGAGVPGPLPDFGLESSGH 
SNST I TTS STRVTKHS TVQHS YS 


6154 


3*60 


2146 


KKKT KM KNTLQKTVN FGAW P KP'T I SDKSHLLQMVS KLDLTDAKN 
SDTAHI KS IBITSILNGLQASESSAEDSEQBDERGAQDMDNNGK 
EES KI DHLTNNRNDL I S KEEQNS S S LLEEKKVHADLVI S KPVSK 
SPERIiRKDIEVLSEDTDYEEDEVTKKRKDVKKDTTDKSSKPQIK 
RG KR R YCNTEECIjKTGS PGK KEEKAXNKES LCMENSSNSS S DED 
EEETKAKMTPTKKYNGLEEKRKSLRTTGFYSGFS EVAEKR I KLL 
NNSDERLQNSRAKDRKDVWSSIQGQWPKKTLKELFSDSDTEAAA 
SPPHPAPEEGVAEESLQTVAEEESCSPSVELEKPPPVNVDSKPI 
EEKTVEVNDRKAEFPSSGSNFSA* IPLPYLHLNRLHQSL *QKGS 
RQQSS VTVSBPLAPNQEEVRSIKSETDSTIBVDSVAGELQDIjQS 
ERE * LASRF* CQCELEQ * * S ARTRTS * KSLYRSEKSERCSGRRK 
F I K KABKKP * SNS GKQQ KEG K 


6155


869 


121 


HLLPELRGKS WITMKYVF YLGVLAGTFFFADSS VQKEDPA P YLV 
YLKSHFNPCVQVLIKPSWVLAPAHCYLPNLKVMLGNFKSRVRDG 
TEQTINPIQIVRYWNYSHSAPQDDLMLIKLAKPAMLNPKVQALN 
P\PTTNVRPGTVCLLSGLDWSQENSGRHPDLRQNLEAPVMSDRE 
CQKTEQGKSHRNSLCVKFVKVFSRIFGEVAVATVICKDKIjQGIE 
VGHFMGGDVGI YTNVYKYVSWI ENTAKDK 


6156


5725 


3984 


GTSTVTMATKKHFSIILNLLGMLLKKDNQDTRKL^TWALEVAV 
VMKKS ET YAP LFCL PS FH KFCKGLLADTLVEDVN I CLQACS S LH 
ALSSSLPDDLLQRCVDVCRVQLVHRGTCIRQAFGKLLKS I PLGV 
FLSNNNHTEIQE I SLALRSHMSKAPSNTFHPQDFSD/ VI S FI LY 
GNSHRTGKD^LERLFYSCQRLDKRDQSTIPRNLLKTDAVLWQW 
AIWEAAQFTVLSKLRTPLGRAQDTFQTIEGIZRSLAGHTLNPDQ : 
DVSQWTTADNDEGHGNNQLRLVLLLQYLENLEKIjMYNAYEGCAN i 
ALTS PPKVIRTFL YTNRQTCQDWLTR IRLS IMRVGLLAGQPAVT i 
VRHGFDLLTEMKTTSLSQGNELEVS IMMWEALCELHCPEAIQG 
I AVWSSS I VGKHLLWINS VAQQAEGRFEKASVE YQEHLCAMTGV 
DCCISSFDKSVLTLASAGCKSASLKHCLNGESRKSVLSKPTDSS 
PEVINYLGNKACECYISTADWAAVQEWQNAIHDLKKSTSSTSLN 
LKADFNYI KSLSS FESGKFVECTEQLELLPGENINIjIAGGSKEK 
IDMKKLLRNM 


6157 


946 


329 


MANRGPS YGLSREVQEKI EQKYDADLENKLVDWI I LQCAEDI EH 
PPPGRAHFQKWLMDGTVLCKLINSLYPPGQEPIPKISESKMAFK 
QMEQ I SQ FLKAAET YGVRTTD I FQTVDLWEGKDMAAVQRTLMAL 
GSVAVTKDDGCYRGEPSWFHRKAQQNRRGFSEEQLRQGQNVIGL 
QMGSNKGASQAGM TG YGM PRQ I M* DAAS CP 


6158


441 


1482 


LGSLIVLSLHCKVIFSSQSLERAMKBKAVDLVPILAQNPGLAQN 
PILEGKDHNQNTGVDPIIDHVQDRKTD/SRSKSPHKKRSKSRER 
RKSRSRSHSRDKRKDTREKIKEKERVKEKDREKEREREKEREKE 
KERGKNKDRDKEREKDREKD KEKDREREREKEHE KDRDKEKEKE 
QDKEKEREKDRSKEIDEKRXKDKKSRTPPRSYNASRRSRSSSRE 
RRRRRSRSSSRSPRTSKTIKRKSSRSPSPRSRNKKDKKREKBRD 
HISERRERBRSTSMRKSSNDRDGKEKLEKNSTSLKEKEHNKEPD 
SSVSKEVDDKDAPRTEENKIQHNGNCQLNEENLSTKTEAV 


6159 


53 


84 


AVIAPLHISLGDRARPYLKNTEKSSTTCSRRRNQSFPPVMSLTH 
RLHLCKYWGCAVSNVCRFWEGRPLPLMIWPYTLPVSLPVGSCV 
I ITGTP IIjTFVKDPQLEVNFYTGMDEDSDIAFQFRLHFGHPAIM 

nscvfgiwryeekcyylpfedgkpfelciyvrhkeykvmvngqr 
iynfahrfppasvkmlqvfrdisltrvlisd*grcvritavqef 
dvsvscdcttayqpg 


6160 


1626 


1790 


agakffp* f*kvadaqptesekeiynqvnwlkdaegiledi j qs ' 
yrgagheireaiqhpadeklqekawgawplvgklkkfyefsqr 
LEAALRGLLGALTST P YS PTQHLEREQALAKQFAJE I LHFTLRFD 
ELKMTNPAIQNDFSYYRRTLSRMR1NNVPAEGENEVNNELANRM 
SLFYAEATPMLKTLSDATTKFVSENKNLP I ENTTDCLSTMASVC 
RVMLETPE YRSRFTNEETVS FCLRVMVGVI I LYDEVHP VGAFAK 
TSK I DMKG CI KVLKDQ P PNS VEG LLNALRYTTKHLWETTSKQI 
KSMLQ*QLLTLVNKG 


6161 




1569 


PVSGSESSLRRAWAS I LRLMLGPRVAVS ILCEDGISH* LLEKH* 
KS HVLE PLS S LALEEQC LALS LDWS TGKTGRAGDQPLK 1 1 S SD S 
TGQLHLLMWETRPRLQKVASWQAHQFEAWIAAFNYWHPEIVYS 
GGDDGLLRG WDTR VPG KFLFTS KRHTMG VCS I QS S PHRE H ILAT 
GS YDEHI LLWDTRNMKQ PLADTPVQGGVWR I KWHP FHHH LLXiAA 
CMHSGFKILNCQKAMEERQEATVLTSHTLPDSLVYGADWSWLLF 
RSLQRAPSWSFPSNLGTKTADLKGASELPTPCMECREDNDGEGH 
ARPQSGMKP LTEGMRKNG TWLQAT AATTRDCG VNPEEAD S AFS L 
LATCS FYDHALHLWEWEGN 


6162


1 


586 


RTIHATGRAGAS PMHRLIVWRLAEANKQHVRCQKCLEFGHWTYE 
CTG KRKYLHRP SRTAELKKALKE KEN RLLLQQS IGETNVERKAK 
KKRSKSVTSSSSSSSDSSASDSSSESEETSTSSSSEDSDTDESS 
SSSSSSASSTTSSSSSDSDSDSSSSSKQ*HQHR*QL*R*TTKEE 
EKEIELLHSYWTDGLKTLM 


6163 


1081 




RIRSTTEGCAVRLHPTQNTGKARIMILLSVSLGRHWAFTYKFFL 
TPWFVFFFPFFHRKE * VMQKNPMKSREDEWMEKLNNLHVQRAD 
MNRLIMNYLVTEGFKEAAEKFRMBSG IEPSVDLETLDERIKI RE 
MILKGQIQEAIAIiINSLHPELLDTNRYIjYFHLQQQHLIELIRQR 
ETEAALEFAQTQLAEQGEESRECLTEMERTLALLAFDSPEESPF 
GDLLHTMQRQKVWS EVNQAVLDY ENRE S T PKLAKLLKLLLWAQNf 
ELDQKKVKYPKMTDLSKGVIEEPK 


6164 




40* 


PCQSPGRSRMRQDKLTGSLRRGGRCLKRQGGGVGTILSNVLKKR 
SC 1 S RTAPRLLCTLBPG VDTKLKFTLEP SLGQNGFQQWYDAL KA 
VARLSTGI PKEWRRKVWIiTLADHYLHS IAIDWDKTMRFTFNERS 
NPDDDSMGI QI VKDLHRTGCS S YCGQEAEQDRWLKRVLLAYAR 
WNKTVGYCQGFN I LAAL I LB VM EGNEGDALK I M I Y L I DKVLP ES 
YFVNNLRALSVDMAVFRDLLRMKLPELSQHLDTLQRTANKESGG 
G YEPPLTNVFTMQWFLTLFATCLPNQTVLKIWDSVPFEGSE I IL 
RVSLAI WAKLGEQ I ECCETADE FYSTMGRLTQEMLENDLLQSHE 
LMQTVYSMAPFPFPQIiAELREKYTYNI TPFPATVKPTS VSGRHS 
KARDSDEENDPDDEDAVVNAVGCLGPFSGFLAPELQKYQKQIKE 
PNEEQSLRSNNIAELSPGAINSCRSEYHAAFNSMMMERMTTDIN 
ALKRQ YSR I KKKQQQQVHQVY I RADKG P VTS I LP SQ VNSS PVIN 
HLLLGKKMKMTNRAAKNAVIHIPGHTGGKISPVPYEDLKTKLNS 
PWRTHIRVHKKNMPRTKSHPGCGDTVGLIDEQNEASKTMGLGAA 
EAFPSGCTATAGREGSSPEGSTRRTIEGQSPEPVFGDADVDVSA 
VQAKLGALELNQRDAAAETE LRVHPP CQRHC P EP PS APE ENKAT 
SKAPQGSNSKTPIFSPFPSVKPLRKSATARNLGLYGPTERTPTV 
HF PQMSRSFS KPGGGN S GP * KM VFS SGTMLS RQLPGY PQ EYQRN 
GGERFG 


6165 


90 


406 


PCQSPGRSRMRQDKLTGSLRRGGRCLKRQGGGVGTILSNVLKKR
SCI SRTAPRLLCTLEPGVDTKLKFTLEPS LGQNGFQQWYDALKA 
VARLSTG I P KE WRRKVW LTLADHYLHS I AI DWDKTMRFTFNER S 
NPDDDSMGIQ I VKDLHRTGCS S YCGQEAEQDRWLKRVLLAYAR 
WNKTVGYCX^FWILAALILEVMEGNEGDALKIMlYLIDiCVLPES 
YFVNNLRALSVDMAVFRDLLRMKLPELSQHLDTLQRTANKESGG 
GYEPPLTNVFTMQWFLTLFATCLPNQTVLKIWDSVFFEGSEIIL 
RVSLAIWAKLGEQI ECCETADE FYSTMGRLTQEMLENDLLQSHE 
LMQTVYSMAPFPFPQLAELREKYTYNITPFPATVKPTSVSGRHS 
KARDSDEENDPDDEDAWNAVGCLGPFSGFLAPELQKYQKQIKE 
PNE EQS LRSNN I AEL S PGA I NSCR S B YHAAFNS MMMKRMTTD I N 
ALKRQYSRI KKKQQQQVHQVYIRADKGPVTSILPSQVNSS PVIN 
HLLLGKKMKMTNRAAKNAVIHIPGHTGGKISPVPYEDLKTKLNS 
PWRTHIRVHKKNMPRTKSHPGCGDTVGLIDBQNEASKTNGLGAA 
EAFPSGCTATAGREGSS PEGSTRRTT EGQSPEPVPGDADVDVSA 
VQAKLGALELNQRDAAAETE LRVHP PCQRHCPE P PS APEENKAT 
SKAPQGSNSKTPIFSPFPSVKPLRKSATARNLGLYGPTERTPTV 
HFPQMSRSFSKPGGGNSGP * KMVFSSGTMLSROLPGYPQE YQRN 
GGERFG 


6166 


2 


1206 


HKLWRTVAMAGAEWKSLEECLEKHLPLPDLQEViCRVLYGKELRK 
LDLPREAFEAASREDFELQGYAFEAAEEQLRRPRIVHVGLVQNR 
I PLPANAP VAEQVS ALHRR I KAI VE VAAMCGVN I ICFQEAWTMP 
FAFCTREKLPWTEFAESAEDGPTTRFCQKLAKNHDMVVVS P ILE 
RDS EHGDVLWNTAWISNSGAVLGKTRKNHI PRVGDFNESTYYM 
EGNI^HPVFQTQFGRIAVNICYGRHHPLNWLMYSINGAEIIFNP 
SATIGALSESLWPIEARNAAIANHCFTCAINRVGTEHFPNEFTS 
GDGKKAHQDFGYFYGSSYVAAPDSSRTPGLSRSRDGLLVAKLDI, 
NLCQQVNDVWNFKMTGRYEMYARELAEAVKSNYS PTI VKE* PAS 
VPALG 

6167


1220 


1844


YGIVTGPSLCAGDKQPKKQEKNPVLVSPEFVDEAIjCACEEYLSN 
LAHMDIDKDLEAPLYLTPEGWSLFLQRYYQVVHEGAELRHIjDTQ 

vqrcedilqqlqawpqidmegdrniwivkpgaxsrgrgimcmd 

HLEEMLKLVNGNPWMKDGKWWQKYIBRPLLIFGTKFDLRQWF 

lvtdwnpltvwfyrds y I rfstqpfs lknldk*aplyltpegws 

LFLQRY YQWHEGAE LRHLDTQVQRCED I LQQLQAWPQIDM EG 
DRNI WI VKPGAKS RGRG IMCMDHLBEMLKLVNGNPWMKDGKWV 
VQKYI ERPLL1 FGTKFDLRQWFLVTDWNPLTVWFYRDSY IRFST 
QPFSLKNLDK 


6168 


84


1392 


VW P VPS VSAMP P KKQAQAGGS KKAEQ KKKEK 1 1 EDKT FGLKN KK 
GAKQQKFIKAVTHQVKFGQQNPRQVAQSEAEKKLKKDDKKKELQ 
ELNELFKPWAAQKISKGADPKSWCAFFKQGQCTKGDKCKFSH 
DLTLERKCEKRSVYIDARDEELEKDTMDNWDEKKLEEWNKKHG 
E AE KKKPKTQ I VCKHFLEAI ENNKYGW F WVCPGGGD I CM YRHAL 
P PGFVLKKKKKKKKKEDE IS L* DLIERERSALGPNVTKI TLES F 
LAWKKRKRQEKIDKLEQDMERRKADFKAGKALVISGREVFEFRP 
ELVNDDDEEADDTRYTQGTGGDEVPDSVSVNDIDIiSLYIPRDVD 
ETGI TVASLER FS T YTSDKDENKLS EASGGRAENGERSDLEEDN 
EREGTENGAIDAVPVDENLFTGEDLDELEEELNTLDLEE 


6169 


112 


662 


APAAAMAERPEDLNLPNAVITRllKEALPDGVNlSKEARSAISR 
AAS VFVLYATS CANN FAMKG KRKTLNAS D Vl»S AM EEME FQR FVT 
PLKEALEAYRREQKGKKEASEQKKKDKDKKTDSEEQDKSRDBDN 

DEDEBRLEEEEQNEEEEVDN*KGRETVAPWKVPLEMRRATCFCB 
AFPCWAE 


6170 


62 


667 


STKVMLPNTGRLAGCTVFITGASRGIGKAIALKAAKDGANIVIA 
AKTAQ PH PK LLGT I YTAAE E I EAVGGKALPC I VDVRDEQQ I S AA 
VEKAI KKFGG IDI LVNNAS AI S LTNTLDTPTKRLDLMMNVNTRG 
i iiaAoruftL-j.^xLAlvoi\.VAHI PNISPPLNI^PVWFKQHCGRW'* VV 
G*GDGLCLICFEIjNLCMSDVITICT 


6171 


382 


941 


HFMQSDVELDCDIEPCGHTKFPPTLPLSTTVIVCSCHPVATAST 
MAEAFSKTTSEEDQSIQEPKEANSMTAQKQKK*GLRGSRRRHAN 
SGGDI FGDS FAAYFPRVLKQVHQALSLSQEAVSVMDSMVRD ILD 
R IATEAGHLAH YS KC VTI TSRD IRMAVCLLLPGKMGXLAESQGT 
NATLRYTKSK 


6172


651 


54 


GLCRAGGAHRFSRTHVEAALKMLRREARLRREYLYRKAREEAQR 
SAQERKERLRRALEENRLIPTELRREALALQGSLSFDDAGGEGV 
TSHVDDEYRWAGVEDPKVMITTSRDPSSRLKMFAKELKLVFPGA 
QRMNRGRHE VGAL VRACKANG VTDLLVVHEHRGTP VGL I VS HLP

FGPTAYFTLCNVVMRHDIPDLGTMSEAKPHLITHGFSSRLGKRV 

SDILRYLFPVPKDDSHRVITFANQDDYISFRHHVYKKTDHRNVE 

LTEVGPRFELKLYMIRLGTLEQEATADVEWRWHPYTNTARKRVF 

LSTE*AAPRPLGQLL 


6173 


3 


288 


SVDHREVQVLSQSMPLTPHQAVLRGERPYMCVECGKCFGRSSHL 
LQHQR I HTGE KP YVCS VCGKAFSQSS VLS KHRTIHTGEKP YE CN 
ECGKAFRVSSDLAQHHKIHTGEKPHECLECRKAFTQLSHLIQHQ 
R1HTGERPYVCPLCGKAFNHSTVLRSHQRVHTGEKPHRCNECGK 
T FS VKRTLLQHQR IHTGEKPYTCS ECG KAFSDRS VL I QHHNVHT 
GEKPYECSECGKTFSHRSTLMNHERIHTEEKPYACYECGKAFVQ 
HSHLIQHQKVHRKL* PTCVLSVGSALAGVPTS FS ISVSTLERSP 
MCAVYVGRPSARAQSLVNTGQFTQVR3PMSVMSVEKPLE 


6174 


1060 


959 


PRPPGKRWMVAGLGNPGLPGTRHSVGMAVLGQLARRLGVAESWT 
RDRHCAADLALAPLGDAQLVLLRPRRLMNANGRSVARAAELFGL 
TAEEVYLVHDELDKPLGRLALKLGGSARGHNGVRSCISCLNSNA 
MPRLRVGIGRPAHPEAVQAHVLGCFSPAEQELLPLLLDRATDLI 
LDHIRERSQGPSLGP*H*WFSKKA 


6175


2204 


334 


RYFRADPRSRSGQPRAEGLGAFAEGPLRAMAAPVKGNRKQSTEG
DALDPPASPKPAGKQNGIQNPISLEDSPEAGGEREEEQEREEEQ 
AFLVSLYKFMKERHTPIERVPHIiGFKQINLWKIYKAVEKLGAYE 
LVTGRRLWKNVYNELGGS PGSTSGATCTRRH Y * RLVLPYVRHLK 
GBDDKPLPTSKPRKQYKMAKENRGDDGATERPKKAKEERRMDQl^ 
MPGKTKADAADPAPLPSQEPPRNSTEQQGLASGSSVSFVGASGC 
PEAYKRLLS S FYCKGTHGIMSPLAKKKLLAQVSKVEALQCQEEG 
CRHGAEPQASPAVHLPESPQSPKGLTENSRHRIjTPQEGLQAPGG 
SLREEAQAGPCPAAPIFKGCFYTHPTEVIjKPVSQHPRDFFSRLK 
DGVLLGPPGKEGLS VKEPQLVWGGDANRPSAFHKGGSRKG I LYP 
KPKACWVSPMAKVPAESPTLPPTFPSSPGLGSKRSLEEBGAAHS 
GKRLRAVS PFL KEADAXKCGAKPAGSGLVS CLLGPAI/3PVP PEA 
YRGTMLHCPLN FTGT PGPLKGQAAL P FS PLVI PAFPAHFLATAG 

PSPMAAGLMHFPPTSFDSALRHRLCPASSAWHAPPVTTYAAPHF 
FHLNTKL 


6176 


1040 


402 


PLSALRAMAEVHVIGQIIGASGFSESSLFCKWGIHTGAAWKLIjS 
GVR BGQTQVDTPQ I GDMAYWS HP I DLHFATKGLQG WPRLHFQVW 
SQDS FGRCQIAG YGFCHVPS S PGTHQLACPTKRPLGSWREQLAR 
AFVGGGPQLLHGDTIYSGADRYRtiHTAAGGTVHLEIGLLLRNFD 
RYGVEC*GTLPPTSPPSTPRTPSDGGGWHSGQEHRL 


6177 


1400 


992 


VPIESLVGKVHNFPLIAFYCCSKGKRQPHKSLHDRCFGEALDPN
GSHC YLDQ I KRS D FLGFSG YS PHF VAI S TNS EHKMQPSSMQQA1* 
PSQ*PYWTDPRPALVPCCSHRPDVHRSRPGPGLPGTSGCSDRPP 
VCPI 


6178 


1027 


254 


STQRGG 1 KG VARAAS LVGRRRAGTGMAL LLCLVCLTAALAHG CI* 
HCHSNFSKKFSFYRHHVNFKSWWVGDIPVSGALLTDWSDDTMKE 
LHLAI PAKITREKLDQVATAVYQMMDQL YQGKM YFPG YFPNELR 
NI FREQVHLIQNAI I ESRIDCQHRCGI FQYETISCNNCTDSHVA 
CFGYNCE SS AQWKSAVQGLLN Y INNWH KQDTSMRPRS S AFS WPG 
THRAAPAFL VL PALR CLEP PHLANLS LEDAA* CL KQH 


6179 


806 


276 


RGETREMAGNLLSGAGRRLWDWVPLACRSFSLGVPRL1GIRLTL 
P PPKWDRWNE KRAMFG VYDNI GI LGNFEKH PKEL IRGPI WLR G 
WKGNELQRCIRKRKMVGSRMFADDLHNLNKRIRYLYKHFNRHGK 
FR*KRKLRTSEKAHLSPWRRETVLFPVRKRLCI FS VI KWGFFGI 


6180 


156 


1833 


DHHILKAASTTHVCARGN 1 FAI PNTRCLEC+ ATATPSSLECQN * 
SHLSLCPliPATTSGIiTPNSMIPEKERQNIAERLLRVMCADLGAL 
S WSGKE FL KLAQTLVDSGAR YGAFS VTE ILGNFNTLlALKHLPR 
M YNQVKVKV TCALG SNACLG I GVTCHS QS VG PDS CY I LTA YQAE 
GNH I KS Y VLG VKGAD I RJ3SGDLVHHWVQN VLSEF VMS E IR TVYV

TDCRVSTSAFSKAGMCLRCSACALNSWQSVLSKRTLQARSMHE 

V I E L LNVC ED LAG S TGLAKETFGS LEETS P P PCWNS VTDSLLL V 

HERYEQICEFYSRAKKMNLIQSLNKHLLSNLAAILTPVKQAVIE 

LSNESQPTLQLVLPTYVRLEKLFTAKANDAGTVSKLCHLFLEAL 

KENFKVHPAHKVAMILDPQQKLRPVPPYQHEEriGKVCELINEV 

KESWAEEADFEPAAKKPRSAAVENPAAQEDDRLGKNEVYDYLQE 

PLFGATPDLFQYWSCVTQKHTKLAKtAFV?LLAVPAVGARSGCVlJ 

MCEQALLIKRRRLLSPEDMNKLMFLKSNML 


6181 


169 


1032 


TRTLLSPVLLPGPRWKPWRRRPMGPLALPAWLQPRYRKNAYLFI 
YYLIQFCGHSWIFTNMTVRFFSFGKDSMVDTFYAIGLVMRLCQS 
VSLLELLHI YVGIBSNHLLPRFIjQLTERI I ILFWITSQEEVQE 
KYWCVLFVFWNHiDMVRYTYSMLSVIGISYAVLTWLSQTLWMP 
I Y P LCVLAE AF AI YQSLP YF ES FGT YSTKLPFDLS I YFP YVLK I 
YLMMLFI GM YFTYSHL YS ERRD I LGI F P I KKKKM * S TAFQCDTR 
KDRLW1QCSK*NTGSILVEKFLVF 


6182 


1769 


1224 


AS * I DYQLNTLLKEFQLTEENTKLRYLTCSLI EDMAAAYFPDCI
VRP FGSS VNT FGKLG CDLDM FLDLDETRNLS AH K I SGN FLME FQ 
VKNVP S ERI ATQK I LS VLGECLDH FG PGCVGVQKI LNARCPLVR 
FSHQASGFQCDLTTNNRIALTSSELLYIYGAIjDSRVRAIiVFSVR 
CWARAHS tiTS S I PGAW I TNFSLTMMVI FFLORRSPP IL PTLDSL 
KTLADAEDKCVI EGNNCT FVRDLS R I KPS QNTETLELL L KB F FE 
YFGNFAFDKNS INIRQGREQNKPDSS PL Y IQNP FETSLN I SKNV 
SQSQLQKFVDLARESAWILQQEDTDRPSISSNRPWGLVSLLLPS 
APNRKSFTKKKSNKFAIETVKNLLESLKGNRTENFTKTSGKRTI 
STQT 


6183 


1118 


452 


HLDRYIKSPGSGSSTPAPPSHLLLYLLHPQSTRTMGCCGCSRGC 
GSGCGGCX33SCGGCGSGCGGOGSGRGGCGSGCGGCSSSCGGCGS 
RC YVPVCCCKP VCSW VPACS CTSCGS CGGSKGGCGSCGGS KGGC 
GSCGCSQSSCCKPCCCSSGCGSSCCQSSCCKPCCCQSSCCVPVC 
CQSSCCKPCCCQSNCCVPVCCQCKI*GSGPRPSGFSCLVXAFLM 
VP 


6184 


1 




IVTVREEDGAPAVAPPGVWSRANKRSGAGPGGSGGGGARGAEE 
EPPPPLQAVLVAOSFDRRFFPISKDQPRVLLPLANVALIDYTIiE 
FLTATGVQETFVFCCWKAAQIKEHLLKSKWCRPTSLNWRIITS 
EIiYRSLGDVLRDVDAKALVRSDFLLVYGDVISNINITRALEEHR 
LRRKL* KNVSVMTMI FKESS PSHPTRCHEDNWVAVDSTTNRVL 
HFQKTQGLRRFAFPLSLFQGSSDGVEVRYDLLDCHISICSPQVA 
QLFTDN FD YQTRDD FVRGLLVNEE I LGNQ I HMHVTAKE YGARVS 
NLHMYSAVCADVIRRWVYPLTPEANFTDSTTQSCTHSRHNIYRG 
PEVSLGHGSILEENVLLGSGTVIGSNCFITNSVIGPGCHIEPGD 
NVVLI^rYLWQGVRVAAGAQIHQSIiLCDNAEVKERVTLKPRSVL 
TSQWVGPNITLPEGSVISLHPPDAEEDEDDGEFSDDSGADQEK 
DKVKMKGYNPAEVGAAGKGYLWKAAGMNMEEEEELQQNLWGLKI 
NMEEESESESEQSMDSEEPDSRGGSPQMDDIKVFQNEVLGTLQR 
GKEENISCDNLVLEINSLKYAYNISLKEVMQVLSHWLEFPLQQ 
MDSPLDSSRYCALLLPLLKAWSPVFRNYIKRAADHLEALAAIED 
PFLEHEA^JGIS^4AKVIiMAFYQLEILAEETILSWFSQRI)TTDKGQ 
QLRKNQQLQRFIQWLXEAEEESSEDD 


6185


791 


44 


P CTS CV L WATLHL PAS XR KAP QAE CGM I S I TEWQ KI GVGITG FG
IFFILFGTLLYFDSVLLAFGNLLPLTGLSLUGLRKTFWFFFQR 
HKLKGTSFLLGGWIVLLRWPLLGMFLETYGFFSLFKGFFPVAF 
GFLGNVCNI PFLGALFRRLQGTSSMV* KTEMS SLNL DHWLKGAK 
REEWEP P PQS PALTHS PTYPGPPQVQKERNGAEQLTSNPQVDS R 
GCQEAEMQTPRRLGWGWYHTLTLYLWEEK 


6186 


569 


238 


VYGIDSSNTNTHGAEERNRKLKKHWKLCHAQSRLDVNGLALKMA 
KERKVKNKVKNKADTEEVFNNSPTNQEKMPTSAILPDFSGSVI S 
NI RNQMETLHSQ PHQE ENLCFENS FSL INLLP INAVE ?TSS QQ t 
PNRETS EANKERKKMTSKSSESNI YS PLTS FITADS ELHD 1 1 KD 
LE DCLMVG LHTCGDLA PNTLRI FTSNS E I KGVCSVGCCYHLLS B 
E FENQHKE RTQE KWG F PMCH YLKE ERWCCG RNARMS ACLALERV 
AAGQG L PTE S LFYRAVLQD1 1 KDCYG I TKC DRHVGKI YS KCS S P 
LDYVRRSLKKLGLDESKLPEKI IMNYYEKYKPRMNELEAFNMLK 
v ViiAJr'^ it, 1 IjIL-LDRIjCXIjKEQEDIAWSALVXLFDPYKSPRCYA 
VIALKKQQ*FPLKQIIRCISL*DSAGCAEEVSVGDGGPALRDAP 
PSGSRVGSRYD 


6187 


1701 


771 


DAWGPETRLARILNPDS FIE PRPGRLPE LEATRPHMEPKASCPA
AAPLMERKFHVLVGVTGSVAALKLPLLVSKLLDIPGLEVAWTT 
ERAKHFYS PQDI PVTLYSDADEWEMWKSRSD PVLH IDLRRWADL 
LLVAPLDANTLGKVASGICDNLLTCVMRAVn)RSKPLLFCPAMNT 
AMWEHPI TAQQVDQLKAFGYVE I PCVAKKLVCGDEGLGAMAEVG 
TIVDBCVKEVLFQHSGFQQS+PGISVJ^GVPLYSEWVQAKSVKMDV 

GKIGGYPHLLNGGPALSLPRGQACSRLNWTEGPGLSFFQPGEAA 
A 


6188 


238 


1534 


KGFVMAGPLMAELQVSPQWKAPEMSQICLSCGHPSA*GPRWASW 
N1GVFICIRCAG1HRNLGVHISRVKSVNLDQWTQEQIQCMQEMG 
NGKANRLYEAYLPETFRRPQIJDPAVEGFIRDKYEKKKYMDRSLD 
INAFRKEKDDKWKRGSEPVPEKKLEPWFEKVKMPQKKEDPQLP 
RKSSPKSTAPVMDLLGLDAPVACSIANSKTSNTLEKDLDLLASV 
PSPSSSGSRKWGSMPTAGSAGSVPSNLNLFPEPGSKSEEIGKK 
QIjSKDSILSLYGSQTPQMPTQAMFMAPAQMAYPTAYPSFPGVTP 
PNS I MGSMMPPP VGMVAQPGASGMVAPMAMPAG YMGGMQASMMG 
VPNGMMTTQQAGYl^GMAAMPQTVYGVQPAQQLQWNLTQMTQQM 
AGMKF F YGANGMMNYGQSMSGGNEQAANQTLS p qmwk 


6189


129^ 


793 


LGE PLGDLCELI PGDVQQLQMGE VH PGTGAQGSAAQS VAG3 VQL 
TQLSHARQRPSCQGSQLIALDLQHMDISRQPRWQHVQPVAROVQ 
RAQQAQLAEGVAVHLWAGDAWAEVELLQEVGGGKVFAANACDL 
WQDHEGAjHAARQATGHALQRVIVQVRRVQPLEAL*RVPSGLPR 
RVRAFMILHNQITGIGREDFATTYFLEELNLSYNRITSPQVHRD 
AFRKLRLLRSLDI^GNRiHMLPPGLPRlJVHVLKVTCRNEIJlAtAR 
GALAGMAQLRELYLTSNRLRSRALGPRAWVDLAHIiQLLDIAGNQ 
LTEIPEGLPESLEYLYLQNNKISAVPANAFDSTPNLKGIFLRFN 
KLAVG S WDS AFRRL KHLQVLDI EGNL EFGD I S KDRGRLGKEKE 
EF F K n P VP P PPTU 


6190 


66 


1309 


ILVGNVSFLLSFAEYVCNCSWGSLNVNRCNQTTGQCECRPGYO 
GLHC E TCKEG F YLN YTSGLCQ PCDCS PHGALS I PCNSSGKCQCK 

CQENS KGNHCE E CKEGFYQS PDAT KECLRCP CS AVTSTGS CS I K 
SSELEPECDQCKDGYIGPNCNKCENGYYNFDSICRKCQCHGHVY 
PVKTPKICKPESGECINCLHNTTGFWCENCL*GYVHDLEGNCIK 
KVILPTPEGSTILVSNASLTTSVPTPVlNSTFTPTTLCn'IFSVS 
TSENSTSALADVS WTQFNI I ILTVI 1 I WVLLMG FVGAVYM YRE 
YQNRKLNAPFWTIELKEDNISFSSYHDS I PNADVSGLLEDDGNE 
VAPNGQLTLTTPIHNYKA 


6191 


1212 


1*11 - ■ 


VNLCHGGLLliLSTHHLGIKPSMH*LFFLMLSFPHLTPQQPKCPS 
MIDWIKKIWYIYTMEYYATIKRNEIMFFAGTWMEMEAIILSKLM 
QDYMFSLISGS 


6192 


3 


950 


TRGCGNKMAGKKWVLSSLAVYAEDSEPESDGEAGIEAVGSAAEE 
KGGLVSDAYGEDDFSRLGGDEDGYEEEEDENSRQSEDDDSETEK 
PEADDPKDNTEAEKRDPQELVASFSBRVRNMSPDEIKIPPEPPG 
RCSNHLQDKIQKLYERKI KEGMDMNYI IQRKKEFRNPS IYEKLI 
QFCAI DELGTNYPKDMFDPHGWSEDS YYEALAKAQK I EMDKLEK 
AKKERTKI EFVTQTKKGTTTNAT^T'TTTTii.^TavannnirD ire i/w 
DSAI P VTT I AQ PT I LTTTATL PAWTVTTSAS GS KTTVI SAVGT 
IVKKAKQ 


6193 


3 


950 


TRGCGN KMAG KKNVLS S LAVYAEDS 3 PESDG EAG I EAVGS AAEB 
KGGLVSDAYGEDDFSRLGGDEDGYE3EEDENSRQSEDDDS3TEK 
PEADDPKDNTEAEKRDPQELYASPSERVRKMSPDEIKIPPEPPG 
RCSNHLQDKIQKLYBRKIKEGMDMNYIIQRKKEFRNPSIYEKLI 
QFCAI DELG7NYPKDM FD PHG WS EDS YYEALAKAQ K I E M DKLEK 

AKKERTKI E F VTnTKfffXTTTN aTQ TTTTTa C T A \tt\ t \ v n v r» trtit 
v luiwvoi i iiv/ii o x l J. 1 1 Ai> JAV/4JJAUKRKSKW 

DSAI ? VTT I AQ PTI LTTTATLPA WT VTT SAS GS KXTV I SAVGT 
IVKKAKQ 


6194 


3 


950 


TRGCGNKMAGKKNVLSSLAVYAEDSEPESDGEAGIEAVGSAAEE

PEADDPKDNTEAEKRDPQELVASFSERVRNMSPDEIKIPPEPPG 
RCSNHLQDKIQKL YERKI KEGMDMNY 1 1 QRKKE F RNPS I YE KL I 
QFCAIDELGTNYPKDMFDPHGWSEDSYYEALAKAQKIEMDKLEK 
AKKERTKI EFVTGTKKGTTTNATSTTTTTASTAVADAQKRKSKW 
DSAI PVTTI AQP TI LTTTATLPAVVT VTTS ASGS KTTVI SAVGT 
IVKKAKQ 


6195 


736 


235 


v/*«v»i*yov>iri tivr x tu x cut x Lii'tiUH PS VRKTHCSGRKHKENVKD 
YYQKWMEEQAQS L I DKTTAAFQQGKI PPTPFSAPPPAGAMI PP P 
PSLPGPPRPGMMPAPHMGGPPMMPMMGPPPPGMMPVGPAPGMRP 
PMGGHMPMMPapPMMRPPARPMMVPTRPGMTRPDR 


6196 


1512 


623 


KTGKRRSAAYVRNILDNAEQVISNLEARNL3PRLTPLLQEEDSH 
QRLLMG LMVS EL KDHFLRHLQG VE KKK I EQMVLD Y I S KLLDL I C 
HIVETNWRKKNLHSWVLHFNSRGSAAEPAVFHIMTRILEATNSL 
* * VJL nixjiii. xi^vycij^jbitNJjjjiiLlOSoVLJLiLTETAVIRIi 
MKDLDNTEKNEKLKFSIIVRLPPLIGQKICRLWDMPMSSNIISR 
NHVTRLLQNYKKQ PRNSM I NKS S FS VE FLPLNY F I E I LTD I ES S 
NQALYPFEGHDNVDAEFVEEAALKHTAMLLGL 


6197 


3 


819 


ADPEGTE2AVMSRYTRPPWTSLFIRNVADATRPEDLRREFGRYG 
P I VD VYI PLDFYTRRPRGFAYVQFED VRDAEDALYNLNRXWVCG 
RQ1EIQFAQGDRKTPGQMKSKERHPCSPSDHRRSRSPSQRRTRS 
RS SS WGRNRRRS DS LKESRHRRFS YS Q S KS RSKS L PRRSTS ARQ 
SRTPRRN7GSRGRSRSKSLQKRSKSIGKSQSSSPQKQTSSGTKS 
RSHGRHSDSIARSPCKSPKGYTNFETKVQTAKHSHFRSHSRSRS 
YRHKNSW 


6198 


111 


1912 


SEAALSPSFISPACFLLRKLPALEDGTLPHPDTLGMNYEGARSE 
RENHAAD DS EGGALDMCCS ER I ■ PGLPQP I VM E ALDB AEGLQDSQ 

REMPPPPPPSPPSDPAQKPPPRGAGSHSLTVRSSLCLFAASQFL 
LACGVLWFSGYGHIWSONATNI»V«!Qr.T,TT.T vnT.PDTJiitfi rsanrm 

GVPSLLLVFLSGGLVLVTTLVWHLLRTPPEPPTPLPPEDRRQSV 
SRQPS FT YSEMMEEKI EDDFLDLDPVPETP VFDCVMDI KPEADP 
TS LTVKS MGLQERRG S WSLTL DM CTPGCNEEGFG YLMS PRE ES 
AREYLLSASRVLQAEELHEKALDPFLLQAEFFEIPMNFVDPKEY 
DIPGLVRKNRYKTILPNPHSRVCLTSPDPDDPLSSYINANYIRG 
YGGEEKVYIATQGPIVSTVADFWRMVWQEHTPIIVKITNIEEMN 
E KCTEY W PE EQ VAYDGVE I TVQ KVIHTED YRLR LI SLKSGTEER 
GLKHYWFTSWPDQKTPDRAPPLLHLVREVEEAAQQEGPHCAPII 
VHCSAG IGRTGCF I ATS I CCQQLRQEG WD I LKTTCQLRQDRGG 
MIQHCEQYQFVHHVMSLYEKQLSHQS PE 


6199 


144 


1211 


MARENGESS5SWKKQAEDIKKIFEFKETL6'1"GAFSEVVLAEEKA 
TGKLFAVKCI PKKALKGKESS I ENE IAVLRKIKHENIVALEDI Y 
ES PNHLYLVMQL VSG GEL FDR I VE KG F YTE KDAS TLIRQVLDAV 
Y YLHRMG I VHRDLKP ENLLYYSQDEES KIM I S DFGLS KM EGKGD 
VMSTACGTPGYVAPEVLAQKPYSKAVDCWSIGVIAYILLCGYPP 
FYDENDS KLFEQI LKAE YE FDS PYWDD I SDS AKDFI RNLME KDP 
N KRYTCEQAARH P Wl AGDTALNKNI HES VS AQI R KN FAKS KWRQ 
AFNATAWRHMRKLH LGSSLDSSNAS VSSS LSLASQKDCASG^F 
HAL* 


6200 


702 


96 


lpevphslrprvkphlccaqpavrvkarlpklavfdldytlwp'f"" 

WVDTHVD P P FHKSS DGTVRDRRGQDVRL YP E VPEVLKRLQS LG V 
PGAAAS RTS E I EGANOLLR TiFIYT. FR Y pvh p j? t vor c v r tu wo t 

qqktgipfsqmiffdderrnivdvsklgvtcihiqngmnlqtls 
qgletfakaqtgplrsslebs pfea 


6201 


2809 


2383


GQTPRVRtokMRRSLRAGKRRQTAGRKSkS PPKVPI VIQDDSLPA 
GPPPQIRILKRPTSKGWSSPNSTSRPTLPVKSLAQREAEYAEA 
RKRXLGSASPEEEQEKPILDRPTRISQPEDSRQPNNVIRQPLGP 
DGSQGFKQRR 


6202 


2 


426 


INADRAAVAS S LLS R PTRKMAPQ KDR KPKRS TWRFNLD LTH P VE 

DG I FDSGN FEQ FLRE KVKVKGKTGNLGN VVH I ERF KNK I TWS E 

KQFSKRYLKYLTKKYLKKNNLRDWLRWASDKETYELRYFQISQ 
DEUESESED 


6203 


419 


2^50 


RCPRPPATAGAAASRPDRSPPSGISGSEAAAGAQAAAPASQHPA
TGTGAVQTEAMKQILGVIDKKLRNLEKKKGKIiDDYQERMNKGER 
LNQDQLDAVS KYQEVTNNLEFAKELQRS FMALSQDI Q KTI KKTA 
RREQU^REEAEQKRLKTVLELQYVLDKLGDDEVRTDLKQGLNGV 
P I LS EEE LS LLDEFYKLVDPERDMS LRLNBQ YE HAS I HLWDLLE 
GKEKPVCGTTYKVLKEIVERVFQSNYFDSTHNHQNGLCEEEEAA 
*"^ AV xs-UUV FisAEPE PAEE YTEQS EVESTE YVNRQFMAETQFTS 
G E KEQVDEWTVET VE WNS LQQQPQAAS PS VP E PHS LT P VAQ AD 
PLVRRQRVQDLMAQMQGPYNFIQDSMLDFENQTLDPA1VSAQPM 
NPTQNMDMPQLVCPPVHS ESRLAQ PNQ VPVQP EATQVP L VS STS 
EGYTASQPLYQPSHATEQRPQKEPIDQIQATISLNTDQTTASSS 
LPAASQPQVFQAGTSKPLHSSGINVNAAPFQSMQTVFKMNAPVP 
PVNEPETLKQQNQ YQAS YNQS FSSQVHQVEQTELQQEQLQTWG 
TYHGS PDQSHQVTGNHQQP PQQNTG F PRSNQ P YYNSRG VSRGGS ' 
RGARGLMNGYRGPANGFRGGYDGYRPSFSNTPNSGYTQSQFSAP 
RD YS GYQRDG YQQNFKRGSGQ SGP RGAPRGRGGP PRPNRGM PQM 
NTQQVN 

6204


2933 


787 


CTHNL I S LLGGRALXHFNR FLNLK I QEGE AHNI FCPAYDCFQL V 
PGDI I KS WS KEMDKR YLQ FD I KAF VENN PA1 KWCPTPGCDRAV 
RLTKQGSNTSGSDTLSFPLLRAPAVDCGKGHLFCWECLGEAHEP 
CDCQTWKNWLQKI TEMKPEELVGVS EAYEDAANCLWLLTNS KPC 
ANCKS P I QKNEG CNHMQCAXCK YDFCW I CLEEWKKHS FVH WE V I 

YRCTRYEVIQHVEEQSKEMTVEAEKKHKRFQELDRFMHYYTRFK 
NHEHSYOLEORTjIjKT2VTrRVM]?rvT^CD tvt vL"ri?rnnnnT'm7TEtm<r 

*■ W ^ -"W ■* JI »--». r\IVE>iu"iI!i yoo*\/vLj]\..Ei L CAjU L Ir UT,ir IE DA V 

HVLLKTRRI LKCS YP YGFFLE P KSTKKE I FELMQTDLEMVTEDL 
AQKVNRPYLRTPRHKIIKAACLVQQKRQEFLASVARGVAPADSo 
EAPRRS FAGGTWDWE YLGFAS PBEYAEFQYRRRHRQRRRGDVHS 
LtiSNPPDPDEPSESTLDIPEGGSSSRRPGTSWSSASMSVLHSS 
S LRD YT PAS RS ENQDS LQALS S LOEDD PN I LLAI QLS LQES GLA 
LDBBTRDFLSNEASLGAIGTSLPSRLDSVPRNTDSPRAALSSSE 
LLELGDSLMRLGAENOPFSTDTLSSKPLSEARSDFCPSSSDPDS 
AGQD PN I NDNLLGN I MAWFHDMNPQS I AL I P PATTE I S ADS QL p 

C I KDGS EGVKDVELVLPEDSMFEDAS VSEGRGTQ1 EENPLEENI 
PGGGKQHPQAW 


6205 


1 


1200 


RAHRG KMAIjE VGDMEDGQLS DS DS DMTVA PSDR PLQL PKVLGGD
SAMRAFQNTATACAP VSHYRAVES VDS S EES FSDS DDDS CLWKR 
KRQKCFNPPPKPEPFQFGQSSQKPPVAGGKKINNIWGAVLQEQN 
QDAVATELGI LGMEGTI DRS RQ S ETYNYLLAKKLRKESQEHTKD 
LDKBLDEYMHGGKKMGSKBEENGQGHLKRKRPVKDRLGNRPEMN 
YKGRYEITAEDSQEKVADEISFRLQEPKKDLIARWRIIGNKKA 
I3LLMETABVEQNGGLFIMNGSRRRTPGGVPLNLLKNTPSISEE 
QIKDIFYIENQKEYENKKAARKRRTQVLGKKMKQAIKSLNFQED 
DDTSR ETFASDTNEALAS LDE SQEGHAEAKLEAEEA I EVDHSHD 
LDIF 


6206 


10 


1442 


IISERRERSCLHLVCIRCSCDWEMGSVLGLCSMASWIPCLCGS 
APCLLCRCCPSGNNSTVTRLIYALFLLVGVCVACVMLIPGMEEQ 
LNKIPGFCENEKGWPCNILVGYKAVYRLCFGLAMFYLLLSLLM 
I KVKS S SDPRAAVHNGFW FFKFAAAI AI I IGAFFI PEGTFTTVW 
FYVGMAGAFCFILIQLVLLIDFAHSWNESWVEKMEEGNSRCWYA 
ALLS ATALN YLLSLVAI VL FF VY YTH PAS C S EN KAF I S VNMLL C 
VG AS VMS I LPK IQE S QPRSGLLQS S V I TVYTM YLTW S AMTNE P E 
TNCNPSLLSI IGYNTTSTVPKEGQS VQWWHAQGI IGLILFLLCV 
FYS S I RTS NNS QVNKLTLTSDES TL I EDGGARSDGS LEDGDD VH 
RA VDNERDG VTYS Y S F FH FML FLASL Y I MMT LTNW Y R Y E PS RE M 
KSQWTAVWVKI SS S W IGI VLYVWTLVAPLVLTNRDFD 


6207 
6208


2924 


1471 


7 VMAEAAT PGTTATTS GAGAAAATAAAAS PTPI PTVTAPSLGAG 
GGGGGSDGSGGGWTKQVTCRYFMHGVCKEGDNCRYSHDLSDSPY 
SWCKYFQRGYCIYGDRCRYEHSXPLKQEEATATELTTKSSLAA 
SSSLSS I VGPLVEMNTGEAES RNSNFATVGAGS EDWVNAIEFVP 
GQPYCGRTAPSCTEAPLQGSVTKE3SEKEQTAVETKKQLCPYAA 
VGE CRYGENCVYLHGDS CDMCGLQVLHP MDAAQRSQH I KS C I E A 
HEKJ3MELSFAVQRSKDMVCGICMEWYEKANPSERRFGILSNCN 
HTYCLKCIRKWRSAKQ FESKI I KS CPECRITSNFVI PS E YWVEE 
KEEKQKLILKYKEAMSNKACRYFDEGRGSCPFGGNCFYKHAYPD 
GRREEPQRQKVGTSSRYRAQRRNHFWBLIEERENSNPFDNDEEE 
W7FELGEMLLMLLAAGGDDELTDSEDEWDLFHDELEDFYDLDL 




2924 


1471 


TVMAEAAT PGTTATTSGAGAAAATAAAAS PTP I PTVTA PS LG AG 
GGGGGSDGSGGGWTKQVTCRYFMHGVCKEGDNCRYSHDLSDSPY 
SWCKYFQRGYCIYGDRCRYEHSKPLKQEEATATELTTKSSLAA 
SSSLSSIVG PLVEMNTGE AES RNSNFAT VGAGSEDW VNAI EFVP 
GQPYCGRTAPSCTEAPLQGSVTKEESEKEQTAVETKKQLCPYAA 
VGECRYGENCVYLHGDS CDMCGLQVLHPMDAAQRSQHI KS CIEA 
KB KDMBLS FAVQRS KDM VCGI CME WYEKANPSERRFG I LSNCN 
HTYCLKCIRKWRSAKQFESKIIKSCPECRITSNFVTPSEYWVEE 
KEEKQKLILKYKEAMSNKACRYFDEGRGSCPFGGNCFYKHAYPD 
GRREEPQRQKVGTSSRYRAQRRNHFWELIBERENSNPFDNDEEE 
WTFELGEMLLMLLAAGGDDELTDSEDEWDLFHDELEDFYDLDL 


6209 


1758 


829 


ER LCF P CMQS K I Y S YMS PNKCSGMR F PLQE ENS VTHHE VKCQGK 
PLAGIYRKREBKRNAGNAVRSAMKSEEQKIKDARKGPLVPFPNQ 
KSEAAEPPKTPPSSCDSTNAAIAKQALKKPIKGKQAPRKKAQGK 
TQQNRKLTDFYPVRRSSRKSKAELQSEERKRIDELIESGKEEGM 
KIDLIDGKGRGVIATKQFSRGDFWEYHGDLIEITDAKKREALY 
AQDPSTGCYMYYFQYLSKTYCVDATRETNRLGRLINHSKCGNCQ 

TKLHDIDGVPHLILIASRDIAAGEELLYDYGDRSKASIEAHPWL 

KH ! 


6210 




-Jot 


i v VjM 5 lUiKM VIiLEDSGS AD FRRH FVNL S PFT I T WLLLS ACFVT 
SSLGGTDKELRLVDGENKCSGRVEVKVQEEWGTVCNNGWSMEAV 
S V I CNQLGCPTAI KAPGWANS S AGSGRI WMDHVS CRGNE SALWD 
CKHDGWGKHSNCTHQQDAGVTCSDGSNLEMRLTRGGNMCSGRIE 
IKFQGRWGTVCDDNFNIDHASVICRQLECGSAVSFSGSSNFGEG 
SGPIWFDDLICNGNESALWNCKHCGWGKHNCDHAEDAGVICSKG 
ADLSLRLVDGVTECSGRLE VRFQGEWGTICDDG WDS YDAAVACK 
QLGCPTAVTAIGRVNASKGFGHIWLDSVSCQGHEPAVWQCKHHE 
WGKHYCNHNEDAGVTCSDGS DLELRLRGGGSRCAGTVEVEI QRL 
LGKVCDRGWGLKEADWCRQLGCGSALKTSYQVYSKIQATNTWL 
FIiSS CNGNETS LWDCKNWQWGGLTCDH Y E £ AK I TCS AHRE PRLV 
GGDIPCSGRVEVKHGDTWGS1CDSDFSLEAASVLCRELQCGTW 
SILGGAHFGEGNGQIWAEEPQCEGHESHLSLCPVAPRPEGTCSH 
SRDVGWCS RYTE I RLVNGKTPCEGRVELKTLGAWGSLCNSHWD 
IEDAHVLCQQLKCGVALSTPGGARFGKGNGQIWRHMFHCTGTEQ 
HMGDCPVTALGASLCPSEQVASVICSGNQSQTLSSCNSSSLGPT 
RPTIPEESAVACIESGQLRLVNGGGRCAGRVEIYHEGSWGTICD 
D S WDLS DAHWCRQK3CGEA INATGSAH FGEGTGP I WLD EKKCN 
GKESR I WQ CHSHGWGQQNCRHKE DAGV I CS E FMS LRLTS E AS RE 
ACAGRLEVFYNG^WGTVGKSSMSETTVGVVCRQLGCADKGKINP 
ASLDKAMSIPMWVDNVQCPKGPDTLWQCPSSPWEKRLASPSEET 
W ITCDNKIRLQEG PTS CSGRVE I WHGGSWGTVCDDSWDLDDAQV 
VCQQLGCGPALKAFKEAEFGQGTGPIWLNEVKCKGNESSXjWDCP 
ARRWGHSECGHKEDAAVNCTDISVQKTPQKATTGRSSRQSSFIA 
VGILGWLLAIFVALFFLTKKRRQRQRLAVSSRGENLVHQIQYR 
EMNSCLNADDLDLMNS SGGHSEPH 


6211



3761 


387 


IFGMSKLRMVLLECSGSADFRRHFVNt^PFTtTWLLLSACFVT 
SSLGGTDKELRLVDGENKCSGRVEVKVQEEWGTVCNNGWSMEAV 
SVICNQLGCPTAIKAPGWANSSAGSGRIWMDIIV3CRGNESALWD 
CKHDGWGKHSNCTHQQDAGVTCSDGSNLEMRLTRGGNMCSGRIE 
I KFQGRWGT VCDDNFN I DHASVI CRQLECGS AVS FSGS SNFGEG 
SGPIWFDDLICNGNESALWNCKHQGWGKHNCDHAEDAGVICSKG 
ADLSLRLVDGVTECSGRLEVRFQGEWGTICDDGWDSYDAAVACK 
QLGCPTAVTAIGRVNASKGFGHIWLDSVSCQGHEPAVWQCKHHE 
WGKHYCNHNEDAGVTCSLX3SDLELRLRGGGSRCAGTVEVEI0RL 
LG KVCDRGWGLKEADWCRQLGCGSALKTS YQVYS K IQATNTWL 
FLSSCNGNETSIiWDCKNWQWGGLTCDHYEEAKITCSAHREPRXiV 
GGDI PCSGRVEVKHGDTWGS ICDSDFSLEAASVLCRELQCGTW 
SILGGAHFGEGNGQIWAEEFQCEGHESHLSLCPVAPRPEGTCSH 
SRDVGWCSRYTEIRIiVNGKTPCEGRVELKTLGAWGSLCNSHWD 
IEDAHVLCQQLKCG VALSTPGGARFG KGNGQ I WRHM FH CTGTEQ 
HMGDCPVTALGA5LCPSEQVASVICSGNQSQTLSSCNSSSLGPT 
RPT I PEESAVACI ESGQLRLVNGGGRCAGRVEI YHEGSWGTI CD 
DSWDLSDAHWCRQI/3CGEAINATGSAH FGEGTGP I WLDEMKCN 
GKESRIWQCHSHGWGQQNCRHKEDAGVICSEFMSLRLTSEASRE 
ACAGRLEVFYNGAWGTVGKSSMSETTVGWCRQLGCADKGKINP 
AS LDKAMSI PMWVDNVQCPKGPDTIjWQCPSSPWEKRLAS PS EET 
WITCDNKIRLQEGPTSCSGRVEIWHGGSWGTVCDDSWDLDDAQV 
VCQQLGCGPALKAFKEAEFGQGTGPIWLNEVKCKGNESSLWDCP 
ARRWGHSECGHKEDAAVNCTDISVQKTPQKATTGRSSRQSSFIA 
VGILGWLLAIFVALFFI.TKKRRQRQRLAVSSRGENLVHQ1QYR 
EMNSCLNADDLDLMNSSGGHSEPH 


6212 


1 


1134 


LKWELRPGGAWGTGRGAGTGAPRSCCCQTNPGPPSSLRRAFRR 
RELPFPACHEIGLGAEAGSGPPPAPAARESRSRAMEEEASSPGI* 
G CSKPHLEKLTLG I TRI LES S PGVTE VTI I EKP PAERHM I S S WE 
QKNNCVMPEDVKNFYLMTNGFHMTWS VKLDEHI I PLGSMAI NS I 

V I FE LDS CNGSGKVCL VY KSGKP ALAEDTE I WFLDRALYWHFLT 
DTFTAY YRLLITHLGLPQWQYAFTS YGI S PQAKQRVSMYKP IT Y 
NTNLLTEETDSFVNKLDPSKVFKSKNKIVIPKKKGPVQPAGGQK 
GPSGPSGPSTSSTSKSSSGSGNPTRK 


6213 


1 


1134 


LKWELRPGGAVWGTGRGAGTGAPRSCCCQTNPGPPSSLRRAFRR 
RELPFPACHEIGIX3AEAGSGPPPAPAARESRSRAMEEEASSPGL 
GCSKPHLEKLTLGITRILESSPGVTEVTIIEKPPAERHMISSWE 
QKNNCVM PEDVKNFYI»MTNGFHMTWS VKLDEHI I PLGSMAXNS I 
S KLTQLTQS S MYSLPNAPTLADLEDDTHEASDDQ PEKPH FDS RS 
V I F ELDS CNGSG KVCL VY KSGKPAliAEDTE I W FLDRAL YWHFLT 
DTFTAYYRLL I THLG L PQ WQYAFTS YGISPQAKQRVSMYKP IT Y 
NTNLLTEETDS FVNKLDPS KVFKS KNKTVI PKKKGPVQPAGGQK 
GPSGPSGPSTSSTSKSSSGSGNPTRK 


6214 


2 


460 


KBLAPSAIRRAARLGLGPARWQSRAAAFYFVRGFRTGWS FVGWV 
VI^TSAKRTRLFFFLSKMAASSRAQVLALYRAMLRESKRFSAYN 
YRTYAVRR IRDAFRENKNVKDPVE I QTLVNKAKRDLGVIRRQVH 
IGQLYSTDKLI IENRDMPRT 


6215 


2 


1849 


F VAGG PRGSGS AAETM P E I R VTPLG AGQD VGRS C I L VS I AGKWV 
MLDCGWHMGFNDDRRFPDFSYITQNGRt.TDFLDCVIISHFHLDH 
CGALPYFSEMVGYDGP I YMTHPTQAI C PI LL EDYRK IAVD KKGE 
ANFFTSQMI KD CMKKWAVH LHQTVQ VDDEX.E I KA YYAGHVLGA 
AMFQI KVGSES WYTGDYNMTPDRHLGAAWIDKCRPNLL ITEST 
YATTIRDSKRCRERDFIiKKVHETVERGGKVLIPVFALGRAQELC 
ILLETFWERMNLKVPI YFSTGLTEKANHYYKLFI P WTNQKIRKT 
FVQRNMFEFKHIKAFDRAFADNPGPMWFATPGMLHAGQSLQIF 
RKWAGNEKNMVI MPGYCVQGTVGHKI LSGQRKLEMEGRQVLEVK 
MQVEYMSFSAHADAKGI MQ LVGQAE PESVLLVHGEAKKMEFLKQ 
KIEQELRVNCYMPANGETVTLPTSPSIPVGISLGLLKREMAQGL 
LPEAKXPRLLHGTLIMKDSNFRLVSSEQALKELGLAEHQLRFTC 
RVHI^DTRKBQETALRVYSHLKSVLKDHCVQHLPDGSVTVESVL 
LQAAAPSEDPGTKVLLVSWTYQDEELGSFLTSLLKKGLPQAPS 


6216 


11 


393 


QTTRPE P RNSALRQSR S KMAWGVS S VSRLLGRS RPQLGRPMS S 
CjAHGE EGSARMW KTLTFF VALPGVAVSMIiNVYLKSHHGEHERPE 
FIAYPHXiRIRTKPFPWGDGNHTLFHNPHVNPLPTGYEDE 


6217 


9 


1178 


TRVGRGESGLKMEVKPPPGRPQPDSGRRRRRRGEEGHDPKEPEQ 
umsMti. bGltti FETTDDS LREHFEKWGTLTDCVVMRDP QTKRSRG 
FGFVTYSCVEEVDAAMCARPHKVDGRVVEPKRAVSREDSVKPGA 
HLTVKKIFVGGIKEDTEEYNLRDYFEKYGKIETIEVMEDRQSGK 
KRGPAFVTFDDHDTVDKIWQKYHTINGHNCEVKKALSKQEMQS 
AGSQRGRGGGSGNFMGRGGNFGGGGGNFGRGGNFGGRGGYGGGG 
GGSRGSYGGGDGGYNGFGGDGGNYGGGPGYSSRGGYGGGGPGYG 
NQGGG YGGGGGYDG YNEGGNFG GGN YGGGGNYND FGN YSG QQQS 


6218 


1305


906 


S CERRGFIMADDLXRFLYKXLPSVEGLHAIWSDRDGVPVIKVA 
NDNAPEHALRPGFLSTFALATDQG3KLGLSKNKSI I CYYNTYQV 
VQFNRLPLVVSF IAS SS ANTGL I VSLEKELAPLFEELRQVVEVS 


6219 


2 


890


AGPGEGAGAGTRCAGAEAEMASAGGEDCESPAPBADRPHQRPFL 

VLTAEQKAKALKGQYNFDHPDAFDNDLMHRTLKNIVEGKTVEVP 
TYDFVTHSRLPETTVVYPADVVLFEGILVFYSQEIRDMFHLRLF 
VDTDSDVRLSRRVLRDVRRGRDLEQILTQYTTFVKPAFEEFCLP 
TKKYADVI I PRGVDNM VAINL I VQHIQDILNGDI CKWHRGQSNG 
RSYKRTFSEPGDHPGMLTSGKRSHLESSSRPH 


6220 


227 


764 


EQNIS LEMSCTI EKALADAKALVERLRDHDDAAESL I EQTTALN
KRVEAMKQYQEEIQEijNEVARHRPRSTLVMGIQQENRQIRELQQ 
ENKELRTSLEEHQSALELIMSKYREQMFRLLMASKKDDPGIIMK 
LKSQH3 K I DM VHRNKS EGFFLDASRHI LEAPQHGLERRHLEAK Q 
NVH 


6221 


98 


916 


RWIWDLNPVSDGLELRPKYNGILHCDTTIWKLDGLRdLYQGVTP 
NIWGAGLSWGLYFVFYNAIKSYKTEGRAERLEATEYLVSAAEAG 
AMTLC I TN PLWVTKTRLMLQYDAWNS PHRQ YKGMF DTLVK I YK 
YEGVRG LYKG FV PGL FGTSHG ALQFMA YELLKLKYKQHI NRLP E 
AQL STVE YI S VAALS KI FAVAATYP YQWRAR LQDQHMF Y SG V I 
D V I TKTWRKEG VGGFY KG I APNL IR VTPACC I TFWYENVSH FL 
LDLREKRK 






WO 01/53312 



PCT/US00/34263 



SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


6222 


2 


2116 


MARELRALLLWGRRLRPLLRAPALAAVPGGKPIIiCPRHTTAQLG 
PRRNPAWSLQAGRLFSTQTAEDKEEPIjHS I ISSTESVQGSTSKH 
EFQABTKKLLDIVARSLYSEKEVFIRELISNASDALEKLRHKLV 
SDGQALPEMEIHLQTNAEKGTITIQDTGIGMTQEELVSNLGTIA 
RSGSKAPLDALQNQAEASSKIIGQFGVGFYSAFMVADRVEVYSR 
S AAPGS LG YQWLSDGSGVFE IAEASGVRTGTKI I IHLKSDCKEF 
SSFARVRDWTKYSNFVSFPLYIjNGRRMOTLQAIWMMDPKDVRE 
WQHEE FYR YVAQAHDKPRYTLHYKTDAPLNIRS I FYVPDMKPSM 
FDVSRELGSSVALYSRKVLIQTKATDILPKWLRFIRGWDSEDI 
PLNLSRELLQESALIRKLRDVliQQRLIKFFIDQSKKDAEKYAKF 
FEDYGLFMR3GIVTATEQEVKEDIAKLLRYESSALPSGQLTSLS 
EYASRMRAGTRNIYYLCAPNRHLAEHSPYYEAMKKKDTEVLFCF 
ay r yu.ii i JjIjH l>RE YDK KKL I S VE TD I WDH YKEE KFEDRSPAAE 
CLSEKETEELMAWMRNVLGSRVTNVKVTLRLDTHPA>fVTVLEMG 
AARHFLRMQQLAKTQEERAQLLQPTLE INPRHALIKKLNQLRAS 
EPGIAQLLVDQIYENAMIAAGLVDDPRAMVGRbNEIiLVKALERH 


6223 


3 


715 


DAWARTMAGMVDFQDEE QVKS FLENME VE CN YHC YHE KD PDGCY 
RLVDYLEGIRKNFDEAAKVLKFNCEENQHSDSCYKLGAYYVTGK 
GGLTQDLKAAARCFLMACEKPGKKSIAACHWGLLAHDGQVNBD 
GQ PDLGKARD YYTRACDGGYTS S C FNLS AMFLQGAPG PP KDMDL 
ACK Y SM KACDLGH I WACANAS RM YKLGIXSVDKVEAKAEVLKNRA 
QQVHKEQQKGVQPLTFG 


6224 


1 


133 


LRTI SSMAWGPLLLTLLAHCTGS WAQS VLTQP PS VSGAR I PHEK


6225 


3259 


938 


LLS CHRLAI CKIjPFS VESRKTVMGPQGARRQAFLAFGDVTVDFT 
OKE WRIXS PAQRAL YRE VTLENYSHLVS LGI LHSKPEL IR RL EQ 
GE V P WGE ERRRRPGPCAG I YAEHVLR PKNLGLAHQRQQQLQFSD 
QSFQSDTAEGQEKEKSTKPMAFSSPPLRHAVSSRRRNSWEIES 
SQGQRENPTEIDKVLKGIENSRWGAFKCAERGQDFSRKMMVIIH 
KKAHSRQKLFTCRECHQGFRDESALLLHQNTHTGEKSYVCSVCG 
RGFSLKANLLRHQRTHSGEKPFLCKVCGRGYTSKSYLTVHERTH 
TGE KPYECQECX3RR FNDKSS YNKHLKAHSGEKP FVCKECGRGYT 
NKS YF WHKR I HSGE KP YRCQECGRG FSNKSH L I THQRTHSGEK 
P FACRQCKQS FS VKGS LLRHQRTHSGEKPFVC KDCERS FSQKST 
LVYHQRTHSGEKPFVCRECGQGFIQKSXLVKHQITHSEEKPFVC 
KDCGRGFIQKSTFTLHQRTHSEEKPYGCRECGRRFRDKSSYNKH 
LRAHLGEKRFFCRDCGRGFrLKPNLTIHQRTHSGEKPFMCKQCE 
KSFSLKANLLRHQWTHSGERPFNCKDCGRGFILKSTLLFHQKTH 
SGBKPFICSECGQGFIWKSNLVKHQLAHSGKQPFVCKECGRGFN 
WKGNLLTHQRTHSGEKPFVCNVCGQGFSWKRSLTRHHWRIHSKE 
KP F VCQECKRG YTS KS DLTVHER I HTGERPYECQECGR KFSNKS 
YYSKHLKRHLREKRFCTGSVGEASS 


6226 


29 


266 


TKV^ELLGGSQRtFFLPLWRRtCR^GtGPRVSPMAGPRVEVDGS 
IMEGGGQSLRVSTGLSWLLSLPWRAQRIRAGRSYA 


6227


2581 


890 


MSASSLIiEQRPKGQGNKVQNGSVHQKDGLNDDDFEPYLSPQARP 
NNAYTAMSDS YLPS YYS PS IG FSYSLGE AAWSTGGDTAMP YIiTS 
YGQLSNGEPKFLPDAMFGQPGALGSTPFLGQHGFN?FPSGIDFS 
AWGNNSSQGQSTQSSGYSSNYAYAPSSLGGAM1DGQSAFANETL 
NKAPGMNTIDQGMAALKLGSTEVASNVPKWGSAVGSGSITSNI 
VASNSLPPAT1APPKPASWADIASKPAKQQPKLKTKNGIAGSSL 
PPPPIKHNMDIGTWDNKGPVAKAPSQALVQNIGQPTQGSPQPVG 
QQANNSPPVAQASVGQQTQPLPPPPPQPAQLSVQQQAAQPTRWV 
APRNRGSGFGHNGVDGNGVGQSQAGSGSTPSEPHPVLEKLRSIN 
NYNP KD FOWNLKHGRVF 1 1 KS YSEDD I HRS I KYN I WCS TE HGNK 
RLDAAYRSMNGKGPVYLLFSVNGSGHFCGVABMKSAVDYNTCAG 
VWSQDKWKGRFDVR W I FVKD VPNSQLRH I RLENNENKPVTNSRD 
TQEVPLEKAKQVLKI IASYKHTTSIFDDFSHYEKRQ


6228 


47 


1978 


GRRCRRRGA VMELAQ EARELG CWAVEEMGVP VAARA P ESTLRRIi 
CLGCX3ADI WAYILQHVHSQRTVKKIRGNLLW YGHQDS PQVRR KL 
ELEAAVTRLRAEIQELDQSLELMERDTEAQDTAMEQARQHTQDT 
QRRALLLRAQAGAMRRQQHTLRD PMQRLQNQLRRLQDMERKAKV 
DVTFGSLTSAALGLEPWLRDVRTACTLRAQPLQNLLLPQAKRG 
S LPTPHDDHFGTS YQQ WLS S VETLLTNH P PGHVLAALEHltAAER 
EAEIRSLCSGDGLGDTEISRPQAPDQSDSSQTLPSMVHLIQEGW 
RTVGVLVSQRSTIJjKERQVLTQRLQGLVEEVERRVLGSSEROVL 
ILGLRRCCLWTELKALHDOSQELQDAAGHRQLLLRELQAKQQRI 
LH WRQLVEETQEQ VRLLI KGNSAS KTRLCRS PGE VLALVQRKW 
PTFEAVAPQSRELLRCLEEEVRHLPHILLGTLLRHRPGELKPLP 
TVLPSIKQLHPASPRGSSFIALSHKLGLPPGKASELLLPAAASL 
RQDLLLLQDQRSLWCWDLLHMKTSLPPGLPTQELLQIQASQEKQ 

QKENLGQALKRLEKLLKQALERIPELQGIVGDWWEQPGQAALSE

EbCQGLSLPQWRIiRWVQAQGAIiQKLCS 


6229 


1571 


560 


GPSLLGTRGTPNPARTLQIFFLIIGRRLTGRMAAVDDLQE?EEFG
NAATSLTANPDATTVNIEDPGETPKHQPGSPRGSGREEDDELLG
NDDSDKTELLAGQKKSSPFWTFEYYQTFFDVDTYQVFDRIKGSL
LPIPGKNFVRLYIRSNPDLYGPFWICATLVFAIAISGNLSNFLI
HLGEKTYHYVPEFRKVS I AATI I YAYAWLVPLALWOFLMWRNSK

VMNI VS YS FliE I VCVYG YSLFIYI PTAI&W 1 1 PHKAVR WI LVM I 
ALG I SGSIjLAMTFWPAVREDNRRVALATI VTI VLLHMLLS VGCL 
AYFFDAPEMDHLPTTTATPNQTVAAAKSS 


6230 


1723 


600 


SKMSGRSGKKKMSKLSRSARAGVIFPVGRLMRYLikKGTFKYRIS
VGAP VYMAAVI E YLAAE I LELAGNAARDNKKAR I APRH I LLAVA 
NDEELNQLLKGVTIASGGVLPRIHPELLAKKRGTKGKSETILSP 
PPEKRGRKATSGKKGGKKSKAAKPRTSKKSKPKDSDKEGTSNST 
SEDGPGDGFTILSSKSLVLGQKLSLTQSDISHIGSMRVEGIVHP 
TTAE IDLKED I GKALE KAGGKEFLE T VKELRKSQGP LE VAEAAV 
S QS SGLAAKFVI H CH I P QWGSDKCE EQLEET I KNCLS AAE DKKL 
KSVAFPPFPSGRNCFPKQTAAQVTLKAISAHFDDSSASSLKNVY 
FLLFDSES IGI YVQEMAKLDAK 


6231 


149 


870 


iilFSSSTMDRSLRNVLVVSFfiFLLLFTAYGGLQSLQSSLYSEEG 
LGVTALSTLYGGMLLSSMFLPPLLI ERLGCKGTI ILSMCGYVAF 
SVGNFFASWYTLIPTSILX^LGAAPLWSAQCTYLTITGNTHAEK 
AG KRG KDMVNQ YFGI FFL I FQS SGVWGNLI S S LVFGQTPSQETL 
PEEQLTS CGAS D CLMATTTTNSTQR PSQQLVTTLhG I YTGSG VL 
AVLMI AAFLQP IRDVQRESE 


6232


3679


1476


^VAGTTMAGFWVGTAPLVAAtiRRGRWPPCXJtMLSAALRTiKHVL
YYS RQCLMVSRNLG S VG YDPNEKT FDK I LVANRGE I ACR V I RT C
KKMGI KTVAIHSDVDASSVHVKMADEAVCVGPAPTSKS YLNMDA
IMEAI KKTRAQAVHPGYGFLSENKEFARCLAAEDWFIGPDTHA
IQAMGDKIESKLLAKKAEVNTIPGFDGVVKDAEEAVRIAREIGY
PVMIKASAGGGGKGMRIAWDDEETRDGFRLSSQEAASSFGDDRL
LIEKFIDNPRHIEIQVLGDKHGNALWLNERECSIQRRNQKWEE
APS I FLDAETRRAMG EQA VALARAVKYS SAGTVEFL VDS KKNFY
FLEMNTRLQVEHPVTECITGLDLVQEMIRVAKGYPLRHKQADIR
INGWAVECRVYAEDP YKS FGLPS IGRLSQYQEPLHLPGVRVDSG
I QPGS D IS! YYDPM IS KLIT YGSDRTE ALKRMADAIiDNYVI RGV
THNIALLRBVI INS R PVKGDI STK FLS DVYPDGFKGHMLTKS EK
NQLLAIASSIjFVAFQLRAQHFQENSRMPVIKPDIANWELSVKLH
DKVHTWASNNGSVFSVEVDGSKLNVTSTWNLASPLLSVSVDGT
QRTVQCLSREAGGNMSIQFLGXVYKVNILTRLAAELMKFMLEKV
TEDTSS VLRSPMPG VWAVS VKPGDAVAEGQEICVI EAMKMQNS
MTAGKTGTVKSVHCQAGDTVGEGDLLVELE


6233


1


2654


HSTRENLNAGNFNFPSEGHLVRSTGPGGSFAKHMVAQCVSPKGP
LACSRTYFFGATHVPYIX3GDSKLPKKTEQIRLLSQIYAAVIEAV 
LAG I AC YAKTSS LTKAKEVAEQTLGSGLDS FEL I PFKAALRSKM 
TFH IHAVNNQGRI VPIiDSEDS LS FVKTACMAVYDI PDLLGGNGC 
LGS WFSES FLTSQ I LVKEKDGTVTTETSS WLTAAVPRFCS WL 
VEDNEVKLSEKTHQAVRGDESFLGTYLTGGEGAYLYSSNLQSWP 
EEGNVHFFS SGLLFSHCRHGSI IIS KDHMNS 1 S FYDGDSTS TVA 
ALLIDFKSSLLPHLPVHFHGSSNFLMIALFPKSKIYQAFYSEVF 
SLWKQQDNSGISLKVIQEDGLSVEQKRLHSSAQKLFSALSQPAG 
EKRSSLKLLSAKLPELDWFLQHFAISSISQEPVMRTHLPVLLQQ 
AE I NTTHR I ESDKVI I S I VTG LPGCHAS ELCAFLVTLHKECGRW 
M VYRQ I MDS S ECFHAAH FQRYLS S ALEAQQNRS ARQS AY I RKKT 
RLLWLQGYTDVIDWQALQTHPDSNVKASFTIGAITACVEPMS 
CYMBHRFLFPKCLDQCSQGLVSNWFTSHTTEQRHPLLVQLQSL 
IRAANPAAAFILAENGIVTRNEDIELILSENSFSSPEMLRSRYL 
MYPGW YEGKLNAGS VYPLMVQ I CVWFGRPLEKTRF VAXCKAIQS 
SIKPSPFSGNIYHILGKVKFSDSERTMEVCYNTLANSLSIMPVL 
EGPTPPPDSKSVSQDSSGQQECYLVFIGCSLKEDSIKDWLRQSA 
KQKPQRKALKTRGMLTQQE 1 RS I HVKR HI^E PLPAG YFYNGTQFV 
NFFGDKTDFHPLMDQFMNDYVEEANREIEKYNQELEQQEYHDLF 
ELKP 


6234 


1731 


404 


PRVREDMDHKSPGNKGSLVYAGI KS I VKSS LGMVESSRHNWSGL 
DKQSDIQNLNEERILALQLCGWIKKG'XDVDVGPFLNSLVQEGBW 
ERAAAVALFNLDIRRAIQILNEGASSEKGDI^U^AMALSGYT 
DEKNSLWREMCSTLRLQLNNPYLCVMFAFLTSETGSYDGVLYEN 
KVAVRDRVAFACKFIiSDTQLNRY IBKLTNEMKEAGNLEGI LLTG 
LTKDG VD LME S YV DRTGD VQTAS YCMT «QGS PLDVLKDERVQ YW I 
ENYRNLL DAWRFWHKRAEFD I HRSKLDP SS KPLAQ VF VS CNFCG 
KSISYSCSAVPHQGRGFSQYGVSGSPTKSKVTSCPGCRKPLPRC 
ALCLINMGTPVSSCPGGTKSDEKVDLSKDKKLAQFNNWFTWCHN 
CRHGGHAGHMLSWFRDHAECPVSACTCKCMQLDTTGNIiVPAETV 
QP 


6235 


1 


571 


EKRDHRLPS W PRAALKVPGRGGRVGTTPELAAGGIMATRNPPPQ 
DYESDDDSYEVLDLTEYARRHQWWNRVFGHSSGPMVEKYSVATQ 
I VM GG VTG V7 CAGFLFQ KVG KLAATAVGGG FLLLQIASHSGYVQI 
D WKRVE KD VNKAKRQ I KKRANKAAPEINNLIEEATBFI KQNIVI 
SSGFVGGFLLGLAS 


6236 


1 


703 


WDQNKGAAAGSGLTLPSLPSARFSAGPPTQRSRPTMSNMEKHLF 
NLKFAAKE LSRS AKKCDKE EKAE KAKI KKAI QKGNMEVAR IHAE 
NAI R Q KNQ AVN FLRMS AR VDAVAAR VQTAVTMG KVT KS MAG W K 
SMDATLKTMNLEKISALMDKFBHQFETLDVQTQQMEDTMSSTTT 
LTTPQNQVDMLLQEMADEAGLDLNMELPQGQTGSVGTSVASAEQ 
DELSQRLARLRDQV 


6237 


312 


720 


PTAMAEEGIAAGGVMDVNTALQEVLKTALIHDGLARGIREAAKA 
LDKRQAHLCVLASNCDEPM YVKL VE AIjCABHQ I NLI KVDDNKKL 
GEWGLCKIDREGKPRKVVGCSCVWKDYGKESQAKDVIEEYFK 
CKK 


6238 


2 


U66 


EBVPTQESVKWE INVI IKNPEIVFVADWTKNDAPALVITTQCEI 
CYKGNLENSTMTAAIKDLQVRACPFLPVKRKGKITTVLQPCDLF 
YQTTQKGTDPQVI DMS VKS LTLKVS? VI INTMI TITSALYTTKE 
TIPEETASSTAHLWEKKDTKTLKMWFLEESNETEKIAPTTELVP 
KGEM IKMNIDS I FI VLEAG IGHRT VPMULAKSR FSGEGKin'JSSL 
INLHCQLELEVHY YNEMFG VWEPLLE PLE I DQTED FRP WNLG I K 
MKKKAKMAIVESDPEEENYKVPEYKTVISFHSKDQLNITLSKCG 
L VM LNNLVKAFT2AATG S S ADFV KDLAJ? FK I LNSLGLT I SVS PS 
DS FSVLNI PMAKS YVLKNGESLSMDYIRTKDNDHFNAMTSLSSK 
LFFILLTPVNHSTADKIPLTKVGRRiYTVRHRESGVERSIVCQI
D T VEGS KKVT I RS P VQ I RNHFS VPLS VYEGDTLLGTAS PB NE FN 
I PLGSYRSF1 FLKPEDENYQMCEGIDFBEI I KNDGALLKKKCRS 
KNPSKESFLIN I VPE KDNLTS LS VYS EDG WDL PYI MHLWP PILL 
RNLL PYK I AYY IEG I ENS VFTLSEGHS AQ I CT AQLG KARLHLKL 
LDYLNHDWKSEYH I KPNQQDI S FVS FTCVTEMEKTDLD I AVHMT 
YNTGQT WAFHS P YWM VNKTGRMLQYKADG I HRKHP PN YKKP VL 
FS FQPNHFFNNNKVQLMVTDSELSNQFS I DTVGSHGAVKCKGLK 
MDYQVGVTIDLSSFNITRIVTFTPFYMIKNKSKYHISVAEEGND 
KWLS LDLEQC I PFW P E YAS SKLL I QVERS ED PP KR I Y FN KQENC 
I LLRLDNELGG 1 1 AE VNLAEHSTV I TFLD YHDGAATFLLINHTK 
NELVQYNQSSLSEIEDSLPPGKAVFYTWADPVGSRRLKWRCRKS 
HGEVTQKDDMMMPIDLGEKTIYLVSFFEGLQRIILFTEDPRVFK 
VTYES BKAELAEQEI AVALQDVG I S LVNNYTKQE VAY 2GITSSD 
WWETKPKKKARMKPMSVKHTEKLEREFKBYTESSPSEDKVIQL 
DTNVP VR LT PTGHNM KI LQ PHVI ALRRNYLPAL KVE YNTS AHQS 
SFRIQIYR2QI QNQ I HG AVFP F VF Y P VKP P KS VTMDS APKPFTD 
VSIVMRSAGHSQISR1KYFKVLIQEMDLRLDLGFIYALTDLMTE 
AEVTENTEVELFHKDIEAFKEEYKTASLVDQSQVSLYEYFHISP 
I KLHLS VS LS S GREE AKDS KQNGG L I PVHS LNLLLKS IGATLTD 
VQDWFKLAFFELNYQFHTTSDLQSEVIRKYSKQAIKQMYVLIL 
GLDVLGNPFGL I REFSEGVEAF FYE PYQGAI QGPE EFVEGMALG 
LKALVGGAVGGLACAAS KI TGAMAKGVAAMTMDEDYQQ KRREAM 
NKQPAG FREG I TRGGKGLVS GF VSG I TG I VTKP I KGAQ KGGAAG 
FF KGVGKGLVGAV AR PTGG I IDMASSTFQGI KRATETS EVESLR 

P P R FFNEDGVI R P YRLRDGTGNQMLQ KIQFYRE WI MTHS S 5 SDD
DDDDDDDDESDL:VH


6239 


2108 


634 


KPGMAGKGSSGRRPLLLGLLVAVXTVHLViCPYTKVEBSFNLQA 
TKDLLYHWQDLEQYDHLE FPGVVPRTFLGP WIAVFSS PAVYVL 
SL L EMS KFYSQL I VRGVLGLGVI FGLWTLQKEVRRHFGAMVATM
FCWVTAMQFHLMFYCTRTIjPNVLALPVVLIJaAAWIJiKEWARFI 
WLSAFAI I VFRV3LCLFLGLLLLLALGNRKVS WRALRHAVPAG 
ILCLGLTVA^SYFWRQLTWPEGKVLWYNTVLNKSSNWGTSPLL 
WYFYSALPRGLGCSLLFIPLGLVDRRTHAPTVLALGFMALYSLL 
PHKELRFI I YAFPMLNI TAARGCS YLLNNYKKSWLYKAGSLLVI 
GHLWNAAYS ATAL YVSH FNYPGG VAMQRLHQLVP PQTDVLLHI 
DVAAAQTGVSRFLQVNSAWRYDKREDVQPGTGMliAYTHILMEAA 
PGLLALYRDTHRVLASVVGT TGVS LNLTQLPPFNVHLQTKLVLL 
ERLPRPS 


6240 


2202 


1176 


HERGDSLKBPTSIAESSRHPSYRSEPSLEPESFRSPTFGKSFHF 
DPLSSGSRSSSLKSAQGTGF3LGQLQSIRSEGTTSTSYKSLANQ 
TRNGSLSYDSLLTPSDSPDFSSVQAGPEPDPPLGYTSPFLSARL 
AQQREAERHPRLVPTGPTHR2PSPVRYDNLSRHIVASLQEREKL 
LRQS P PL PORE EE PGLGDSGIQSTPGSGHAPRTS SSSDDS KRS P 
LGKTPLGRPAVPRFGKPDGLRGRGVGS PEPG PTAP YLGRSMSYS 
9QKAQPGVSETEEVALQPLLTPKDEVQLKTTYSKSNGQPKSLGS 
ASPGPGOPPLSSPTRGGVKKVSGVGGTTYRTfiV 


6241 


3 


1341 


RNAEEKKRLSLQREKI IARVS I DN RTRALVQALRRTTDP KLC I T 
RVEELTFHLLEFPEGKGVAVKERI I PYLLRLRQI KDETLQAAVR 
EILAL IGYVDPVKGRGIR ILS IDGGGTRGWALQTLRKLVEIjTQ 
KPVHQL FDYICG VS TG A I LA FMLGLFHMPLDECEEL YRKLGS D V 
FSQNVIVGTVKMSWSHAFYDSQTWENILKDRMGSALMIETARNP 
TCPKVAAVSTIVl^GITPKAFVFRNYGHFPGINSHYLGGCQYKM 
WQAI RAS S AAPG YFAE YALGNPLHQDGGIJjLNNPSALAMHECKC 
LWPD V P LECIVS LGTGRYBS DVRNT VTY TS LKTKLSNVINS ATD 
TEEVHIMLDGLLPPDTYFRFNPVMCENIPLDESRNEKLDQLQLE 
GLKYIERNEQKMKKVAKILSQEKTTLQKINDWIKLKTDMYEGLP 



PFSKL


6242 


198 


1310 


QHFLPGAETWSPGAAVCTARRFPGRSLAAFPRPAAPRRAVEMGE 
S S ED I DQM FS TLLGEM DLLTQS LG VDTL PPPD PNP PRAEFNYS V 
G FKDLNESLNALE DQD LDALMADLVADI S EAEQRTIQ AQKESLQ 
NQHHS ASLQAS I FSG AAS LGYGTNVAATG I SQY SDDL P P P PAD P 
VLDLPLPPPPPEPLSQEEEEAQAKADKIKLALEKLKEAKVKFOiV 

VKVtfMWnWQTVCT.MX/nPDOT ftOMn TWIT DDVmunnMtn nuTM 

IYPELQIERFFEDHENVVEVLSDWTRDTENKILFLEKEEKYAVF 
KNPQNFYLDNRGKKESKETNE^INAKNKESLLEVRLILQSGRKE 


6243 


1503 


614


RSASRFSGCWSRDSTCCtiCPSTCWSRSSASCPRARWPPS^APAT 
TS RAS S RR LACG PQTRAGAETRSTAM I RANS AARDTRRATCRSA 
AGTPSPTTMTCLTDVPTGCAAVEPTARLPAAAWASTITTGCCPA 
MGQAGAGPAGRKGSEAGGG PGRAHHAHPSPLPREPRVRTG ? PAH 
SPTPGSIDPSPELSWGSAGVTQESPLLDPVDFLLFRTRAVDPLR 
RVFFFFYQHLTFFSIQPQPPPCHAFHPRDPPAGTKRQLILVPLK 
GPPILAPILSLTPILSRWSCYFPRSR1AQGWHLS 


6244 


2119 


1745 


FEHAYASQFGTFLGNNESERCKLKLQQKTMSLWSWVNQPSELSK 
FTNPLFEANNLVI WPS VAPQSLP LWEGI FLRWN RS S KYLDEAYE 
EMVNI I EYNKELQAKVNILRRQLAELETEDGMQES P 


6245 


81 


1148 


LSLRNAKYSFPQELISLFSMTDLNDNICKRYIKMITNIVILSLI 
ICISLAFWIISMTASTYYGNLRPISPWRWLFSWVPVLIVSNGL 
KKKSLDHSGAIiGGLWGFlLTIANFSFFTSLLMFFLSSSKLTKW 
KGEVKKRLDSEYKEGGQRNWVQVFCNGAVPTELALLYKIENGPG * 
EI P VDFS KQYS AS WMCLS LLAALACS AGDTW A S E VGP VLSKS S P i 
RJjI TTWEKV P VGTNGGVTWGLVS S LLGGTFVGI AYFLTQLI FV 
NDLD IS ApQWP 1 I AFGGLAGLLGS I VDS YLGATMQYTGLDESTG 
MWNS PTNKARHI AGKPILDNNAVNL FS SVLIALLLPTAAWGFW 
PRG 


6246


1177 


359


SLWPWI LMDDSIMQISLQLLCVYTANFPNGCSSLCWSSCGQHPV
QATHRGAVSNS LM LCI L KLAS QM PL ENTT VQQMVFMLLSNLALS 
HDCKGVIQKSNFLQNFLSLALPKGGNKHLSNLTILWLKLLLNIS 
SGEDGQQMILRLDGCLDLLTEMSKYKHKSSPLLPLLIFHNVCFS 
PANKPKI LANE KV I TVLAACLESENQNAQRIGAAALWALI YNYQ 
KAKTALKS PS VKRR VDEAYS LAKKTFPNSEANPLNA YYLKCLEN 
LVQLLNSS 


6247 


3 


1678 


NSRVWGPWTEPSAGSLRPMARKQNRNSKELGLVPLTDDTSHAGP 
PGPGRALLECDHLR SG VPGGRRRXDWS CS LLVASLAGAFGS S FL 
YGYNLS VVNAPT P Y I KAF YNES WERRHGRP I DP DTLTLLWS VTV 
SIFAIGGLVGTLIVK>1IGKVLGRKHTLLANNGFAISAALLMACS 
LQAGAFEMLIVGRFIMGIDGGVALSVLPMYLSEISPKEIRGSLG 
OVTAI FI CIGVFTGQLLGLPELLGKBST WPYLFG VI WPA WQL 
LSLPFLPDS PRYLLLEKHNEARAVKAFQTFLGKAHVSQEVBE VL 
AESRVQRSIRLVSVLELLRAPYVRWQWTVIVTMACYQLCX5LNA 
I WFYTNS I FGKAG I P PAK1 P YVTLSTGG I ETLAAVFSGLVI EHL 
GRRPLLIGG FGLMGLFFGTLTI TLTLQDHAPWVPYLS I VG I LAI 
IAS FCSGPGGI P FI LTGEFFQQSQRPAAFI I AGTVNWLSNFAVG 
LL FP P I QKSLDT YCFL VFAT I C I TGA I YLYF VLPETKNRTYAE I 
SQAFSKRNKAYPPEEKI DSAVTDGKINGRP 


6248




1773 


VPPPRMMAAVPPGLE PWNRVR I PKAGNRS AVTVQNPGAALDLCl
AAVIKECHLVILSLKSQTLDAETDVLCAVLYSNHNRMGRHKPHL 
ALKQVEQCLKRLKNMNLEGS I QDLFELFSSNENQPLTTKVCWP 
SQPVVELVLMKVLGACKLLLRIiLDCCCKTFLLTVKHIjGLQEFI I 

LNLVMVGLVS RLWVLYKGVLKR L I LL YE PLFGLLQE VARIQ PM P
YFKDCTFPSDITEFLGQPYFEAFKKKMPIAFAAKGINKLLNXLF

LINEQSPRASEETLLGISKKAKQMKINVQNNVDLGQPVKNKRVF
KE E 5 S B FD VRAFCNQLKHKATQ ETS FDFKCS QSRL KTTKYSS Q K 
VIGTPHAKSFVQRFREAESFTQLSEEIQMAWWCRSKKLKAQAI 
FLGNKLLKSNRI,KHLEAQGTSLPKECLECIKTSICNHLLRGSGIK 
TS KHHLRQRRS QNKFLRRQR KPQR KLQSTLLRE IQQ FSQGTR KS 
ATDTSAKWRLSHCTVHRTDLY PNSKQLLNSGVSMPVIQTKEKM I 
HENLRGIHENETDSWTVMQINKN3TSGTIKETDDIDDIFALMGV 


6249 


56 


1773 


VPPPRMMAAVPPGLEPWNRVRIPKAGNRSAVTVQNPGAALDLCI 
AAVIKECHLVlLSLKSQTLDABTDVIjCAVLySNHNRMGRHKPHL 
ALKQVEQCLKRLKNMNLEGSIQDLFELFSSNENQPLTTKVCWP 
SQPWELVLMKVLGACKLLLRLLDCCCKTFLLTVKHLGLQEFII 
LNLVMVGLVSRLWVLY KGVLKRL I LLYBPLFGLLQEYAR IQPM P 
YFKDFTFPSDITEFLGQPYFEAFKKKMPIAFAAKGINKLLNKLF 
L INEQSPRASEETLLGI S KKAKQMKINVQNNVDLGQPVKNKRVF 
KEESSEFDVRAFCNQLKHKATQETSFDFKCSQSRLKTTKYSSQK 
V I GT PHAKS FVQR FRE AE S FTQLS EE IQMAWWCRS KKLKAQA I 
FLGNKLLKSNR LKHL E AQGTSLP KKLE CI KTS I CNHLLRGSG I K 
TS KHHLRQRRS QNKF LRRQR KPQR KLQSTLLRE I QQFSQGTRKS 
ATDTSAKWRLSHCTVHRTDLYPNSKQLLNSGVSMPVIQTKEKMI 
HENLRGIHENETDS WTVMQINKNSTSGT1 KETDDIDD I PALMGV 


6250


232 


1306 


IAALHIMALPFRKDLEKYKDLDEDELLGNLSETELKQLETVLDD 
LDPENAIiLPAGFRQKNQTS KSTTG PFDREHLLS YLEKEALEHKD 
REDYVPYTGEKKGKI FI PKQKPVQTFTEEKVSLDPELEEALTSA 
S DTELCDLAAILGMHNL I TNTKF CN I MGSSNG VDQEHFSNWKG 
EKILPVFDEPPNPTNVEESLKRTKENDAHLVEVNLNNIKNIPIP 
TLKDFAKALETNTHVKCFSLAATRSNDPVATAFAEMLKVNKTLK 
SLNVESNF ITGVGILAL I DALRDNETLAELKIDNQRQQLGTAVE 
LE MAKMLEBNTNI L KFG YQFTQQG PRTRAANA ITKNNDLVRKRR 
VEGDHQ 


6251 


62 


972 


TPGSGPMSAWAAASLSRAAARCLLARGPGVRAAPPRDPRPSHPE 
PRGCGAAPGRTLH FTAAV PAGHNKWS KVRH I KGPKDVERSRI FS 
KLCLNIRLAVKEGGPNPEHNSNLANILEVCRSKHMPKSTIETAL 
KMEKSKDTYLLYEGRGPGGSSLLIEALSNSSHKCQADIRHILNK 
NGGVMAVGARHSFDKKGVIWEVEDREKKAVNLERALEMAIEAG 
AEDVKETEDEEERNVFKF ICDASSLHQVRKKLDS LGLCS VS CAL 
EFI PNS KVQLAEPDLEQAAHLIQALSNHEDVIHVYDNIE 


6252 


27 


1897 


EEFCTWIAVRVGEMETAPKPGKDVPPKKDKLQTKRKKPRRYWEE 

ETVPTTAGASPGPPRNKKNRELRPQRPKNAYILXKSRISKKPQV 

PKKPREWKNPESQRGLSGAQDPFPGPAPVPVEWQKFCRIDKSR 

KLPHSKAKTRSRLEVAEAEEEETS IKAARSELLLAEEPGFLEGE 

DGEDTAKI CQADI VEAVD I ASAAKHFDLNLRQFGP YRLNYSRTG 

RHtiAFGGRRGHVAALDWVTKKLMCBINVMEAVRDIRFLHSEALL 

AVAQNRWLH I YDNQG I ELHC I RRCDRVTRLEFLPFHFLLATAS E 

TGFLTYLDVSVGKIVAALNARAGRLDVMSQNPYNAVIHLGHSNG 

TVSLWSPAMKEPLAKILCHRGGVRAVAVDSTGTYMATSGLDHQL 

KI FDLRGT YQPL S TRTLPHGAGHLAFS Q RGLL VAGMGD WNI WA 

GQG KAS PPS LEQ P YLTHRLS G P VHGLQ FCPFEDVLGVGHTGG I T 

&MijVi?eAeEPNFDGLESNPYRSRKQRQEWEVKALLEKV 

LDPRALAEVDVISLEQGKKEQIERLGYDPQAKAPFQPKPKQKGR 

SSTASLVKRKRKVMDEEHRDKWQSLQQQHHKEAKAKPTGARPS 

ALDRFVR 


6253 


27 


1897 


EEFCTWIAVRVGEMETAPKPGKDVPPKKDKLQTKRXKPRRYWEE 
ETVPTTAGASPGPPRNKKNRELRPQRPKNAYXLKKSRISKKPQV 
PKKPREWKJJPESQRGLSGAQDPFPGPAPVPVEWQKFCRIDKSR 
KL PHS KAKTRSR LE VAEAEEEETS I KAARS E LLLAE E PGFLEGE 
DGEiyTAK I CQADI VEAVD I ASAAKHFDLNLRQFGP YRLNYSRTG 
RHLAFGGRRGHVAALDWVTKKLMCEINVMEAVRDIRFLHSEALL
AVAQNRWLHIYDNQGIELHCIRRCDRVTRLEFLPFHFLLATASE 
TGFLTYLDVSVGKIVAALNARAGRLDVMSQNPYNAVIHLGHSNG 
TV^LWSPAMKEPLAKILCHRGGVRAVAVDSTGTYMATSGLDHQL 
K I FDLRGT YQ PLSTRTLPHGAGHLAFSQRG LLVAGMG DWN I WA 
GQGKASPPSLEQPYLTHRLSGPVHGLQFCPFEDVLGVGHTGGIT 
SMLVPGAGEPNFDGLESNPYRSRKQRQEWEVKALLEKVPAELIC 
LDPRALAEVDVISLEQGKKEQIERLGYDPQAKAPFQPKPKQKGR 
SSTASLVKRKRKVMDEEHRDKVRQSLQQQHHKEAXAKPTGARPS 
ALDRFVR 


6254 


155 


1139 


HALG RRGG S Q E LS AAACG CFAL RLRAPGSGRPALA PGAAA FAGL 
GGA PRFP PRGS AAGRTMLLKE YRI CMPLTVDE YKI GQL YM I S KH 
SHEQSDRGEGVE WQNEPFEDPHHGNGQFTEKRVYLNS KLP SWA 
RAWPKIFYVTEKAWNYYPYTITEYTCSFLPKFSIHIETKYEDN 
KGSNDTIFDNEAKDVEREVCFIDIACDEIPERYYKESEDPKHFK 
SEKTGRGQLREGWRDSHQPIMCSYKLVTVKPEVWGLQTRVEQFV 
HKVVRDILLIGHRO^FAWVDEWYDMTMDDVREYEKNMHEQTNIK 
VCNQH SS P VDD I ESHAQTST 


6255 


1 


1444


PTRPQQELLVS LATVI FVASQKALS VESKAVI KQQLESVSNG WT
VYR IARQASRMGNHDMAKELYQSLLTQVASKHF YFWLNSLKE FS 
HAEQ CLTGLQE ENYS S ALSC I AES LKFYHKGI AS LTAAST PLNP 
I^FQOTFVKLRIDLLQAFSQLICTCNSLKTSPPPAIATTIAMTL 
GNDLQRCGRISNQMKQSMEEFRSLASRYGDLYQASFDADSATLR 
NVELQQQSCLLISHAIEALILDPESASFQEYGSTGTAHADSEYE 
RRMMSVYNHVLEEVESLNGKYTPVSYMHTACLCNAIIALLJCVPL 
S FQR Y F FQ KLQSTS I KLALS PS PRN PAE PIAVQNNQQLALKVEG 
WQHGSKPGLFRKIQS VCLNVSSTLQSKSGQDYKI PIDNMTNEM 
EQRVEPHND Y FSTQFLLNFAI LGTHN I TVES S VKDANG I VWKTG 
PRTTIFVKSLEDPYSQQIRLQQQQAQQPLQQQQQRNAYTRF 


6256 


1 


1542 


CRGAGAEPAANPRSPRSLVPSLESTSTSVPPAPGTMATDSWALA
VDEQEAAAES LSNLHLKEEKI KPDTNGAWKTNANAEKTDEEEK
EDRAAQSLLNKLI RSNLVDNTNQVEVLQRDPNS PLYSVKSFEEL
RLKPQLLQCJVYAMGFNRPS KIQENALPLMLAEPPQNLIAOSQSG
TGKTAAFVLAMLSQVEPANKYPQCLCLSPTYELALQTGKVIEQM
GKFYPELKLAYAVRGNKLERGQKISEQIVIGTPGTVLDWCSKLK
FIDPKKIKVFVLDEADVMIATQGHQDQSIRIQRMLPRNCQMLLF
SATFEDSVWKFAQKWPDPNVIKLKREEETLDTIKQYYVLCSSR

DEKFQALCNLYGAI T I AQAM I FCHTR KTAS WLAAELS KEGHQ VA 
LLSGEMMVEQRAAVIERFREGKEKVLVTTNVCARG IDVEQVSVV 
INFDLPVDKDGNPDNETYLHRIGRTGRFGKRGLAVNMVDSKHSM 
NI LNR I QEHFN KK I ERLDTDDLDE I EKI AN 


6257


210 


615 


AF I PAMAE L IQKKLQGE VE KYQQLQKDLS KSMSGRQKLEAQLTE 
NNIVKEELALLDGSNWFKLLGPVLVKQELGEARATVGKRLDYI 
TAEI KRYES QLRDLERQS EQQRETLAQLQQE FQRAQAAXAG APG 
KA 


6258 


210 


615 


AF I PAMA E LI QKKLQGEVE KYQQLQKDLS KSMSGRQ KLEAQLTE

NN I VKE E LALLDGS NVVFKLLGP VL VKQELGEARATVGKRLD Y I 

TAEIKRYESQLRDLERQSBQQRETLAQLQQEFQRAQAAKAGAPG 
KA 


6259


2 


1540 


ILEKGFPSQCHPERKWKVDDVLESSQENEDDHFWELLFHNNKTV
S VENGDRGSKTFNLGTDPVSLRNYP YKI CDSCEMNLKNISGLI I 
SKKNCSRKKPDEFNVCEKLLLDIRHEKIPIGEKSYKYDQKRNAI 
NYHQDLSQPS FGQS FEYSKNGQGFHDEAAFFTNKRSQIGETVCK 
YNECGRTFIESLKLNISQRPHLEMEPYGCSICGKSFCMNLRFGH 
QRALTKDNPYEYNEYGEIFCDNSAFIIHQGAYTRKILREYKVSD 
KTWEKSALL KHQ I VHMGGKS YD YN ENGSNFS KKSHLTQLRRAHT 
GEKTFECGECGKTFWEKSNLTQHQRTHTGEKPYECTECGKAFCQ
KPHLTNHQRTHTGEKPYECKQCGKTFCVKSNLTEHQRTHTGEKP 
YE CNACG KS FCHRS ALT VHQRTHTGE KP P I CNE CGKS F C VKS NL 
IVHQRTHTGEKPYKCNECGKTFCEKSALTKHQRTHTGEKPYECN 
ACGKTFSQRSVLTKHQRIHTRVKALSTS 


6260 


2081 


1436 


GTGPE IHACAHAS ARAPGSRAMALREIiKVCLLGDTGVGKSS I VW 
RFVEDSFDPNINPTIGASFMTKTVQYQNELHKFLIWDTAGQERF 
RALAPMYYRGSAAAIIVYDITKEETFSTLKNWVKELRQHGPPNI 
WAX AGNKCDL ID VR E VMERDAKD YADS IHAI FVETSAKNAINI 
NELFIEISRRIPSTDANLPSGGKGFKLRRQPSEPKRSCC 


6261 


3 


1188 


FWYRLGPGTRSRWPRRGSWAASLVPRGPSPAALVTSPCPPDPLR 
SPACEPCRPDFAPRPALLLRSGPRSAPAVTGKPALKGQPGPWPG 
MAE VS I DQS KliPGVKE VCRDFAVLE DHTIiAHS LQEQE I EHH LAS 
NVQRNRLVQHDLQVAKQLQBEDLKAQAQLQKRYKDLEQQDCBIA 
QEIQEKLAIEAERRRIQEKKDEDIARLLQEKELQEEKKRKKHFP 
EFPATRAYADSYYYEDGGMKPRVMKEAVSTPSRMAHRDQEWYDA 
E1ARKLQEEELLATQVDMRAAQVAQDEEIARLLMAEEKKAYKKA 
KEREKSSLDXRKQDPEWKPKTAKAANSKSKESDEPHHSKNERPA 
RP P P P IMTDGEDAD YTHFTNQQSSTRHFSKSES SHKGFH YKH 


6262 


2 


1759 


PE CH SQGLCS VHRPGKVPQARMS GI» VLGQRD E PAGHR LS Q K E 1 1* 
GSTRLVSQGLEALRS EHQAVIiQSItSQTI ECLQQGGHEEGLVHEK 
ARQLRRSMENIELGLSEAQVMLALASHLSTVESEKQKLRAQVRR 
LCQENQWLRDELAGTQQRLQRSBQAVAQLEEEKKHLEFLGQLRQ 
YDEDGHTSBEKEGDATKDSLDDLFPNEEEEDPSNGI/SRGQGATA 
AQQGGYEIPARLRTIjHNLVIQYAAQGRYEVAVPLCKQALEDLER 
TSGRGH PDVATMLNI LAIiVYRDQNKYKEAAHLIjNDAIjS I RES TL 
GPDHPAVAATLNNIiAVLYGKRGKYKEAEPLCQRAIiEIREKVLGT 
NHPDVAKQLNNLALLCQNQGKYEAVERYYGRALAIYEGQLGPDN 
PNVARTKNNLASC YLKQGKYAEAETliY K E I LTRAHVQEFGS VDD 
DHKPIWMHAEEREEMSKSRHHEGGTPYAEYGGWYKACKVSSPTV 
NTTLRIII/SALYRRG^KLEAAETLEECALRSRRQGTDPISQTKVA 
ELLGESDGRRTSQEGPGDSVKFEGGEDASVAVEWSGDGSGTLQR 
SGSLGKIRDVLRR 


6263 


1 


2408 


RELDS IiADLPER I KPP YANGLSTSHLRSSS VBDVKLI I S EGRPT 
IE VRRCSM PS V X CEHTKQFQT ISBE SNQGSLLT VPGDTS PS P KP 
EVFSNVPERDLSNVSNIHSSFATSPTGASNSKYySADRNL I KNT 
AP VNTVMDSPVHLE PSSQ VGVIQNKSWEMPVDRLETLSTRDFI C 
PNSNI PDQESSLQS FCNSENKVLKENADFLSLRQTELPGNSCAQ 
DPASFMPPQQPCSFPSQSLSDAESISKHMSLSYVANQEPGILQQ 
KNAVQIISSALDTDNESTKDTENTFVLGDVQKTDAFVPVYSDST 
I QEAS PN FE KAYTL P VL P S EKD FNGS D ASTQLNTHYAFS KLT YK 
SSSGHEVENSTTDTQVISHEKENKLESLVLTHLSRCDSDLCEMN 
AGMPKGNLNBQDPKHCPESEKCLLSIEDEESQQSILSSLENHSQ 
QSTQPEMHKYGQLVKVELEEfJAEDDKTENQIPQRMTRNKANTMA 
NQSKQILASCXLLSEKDSESSSPRGRIRLTEDDDPQIHHPRKRK 
VSRVPQPVQVSPSLLQAKEKTQQSIAAIVDSLKLDEIQPYSSER 
ANPYFEYLHIRKKIEEKRKLLCSVIPQAPQYYDEYVTFNGSYLL 
DGNPLSKICIPTITPPPSLSDPLKELFRQQEWRMKLRLQHSIE 
REKL I VSNEQE VLR VHYRAARTLANQTLP FS ACTVLLDAE VYNV 
PLDSQ S DDS KTS VRDR FNARQ FMS WLQDVDDKFD KL KTCLLMRQ 

QHEAAALNAVQRLEWQLKLQELDPATYKSrSIYEIQEFYVPLVD 
VNDDFELTPI 


6264 


143 


1960 


KHRQE2JNALDMAPEIHMTGPMCLIENTNGELVANPEALKILSAI 
TQPWWAIVGLYRTGKS YLMNKLAGKNKGFSLGSTVKSHTKG I 
HMWCVPHPKKPEHTLVLLDTEGLGDVKKGDNQNDSW1FTLAVLL 
SSTLVYNSMGTINQQAMDQLYYVTELTHRIRSKSSPDENENEDS 
ADFVSFFPDFVWTLRDFSLDLEADGQPLTPDEYLEYSLKLTQGT
SQKDKNFNLPRLCIKKFFPKKKCFVFDLPIHRRKIiAQLEKLQDE 
ELDPEFVQQVADFCSYI FSNSKTKTLSGGI KVNGPRLESLVLTY 
INAISRGDLPCMENAVLALAQ I ENSAAVQKAIAHYDQQMGQKVQ 
LPAETLQELLDLHRVSEREATEVYMKNS FKDVDHLFQKKLAAQL 
DKKRDDFCKQNQEAS SDRCSALLQVI FS PLEEEVKAGI YSKPGG 
YCLFIQKIiQDLEKKYYEEPRKGIQAEEILQTYLKSKESVTDAIL 
QTDQ ILTE KEKE I EVECVKAESAQASAKMVEEMQI KYQQMMEEK 
EKSYQBHVXQLTEKMERERAQLLEEQEKTLTSKLQEQARVLKER 
CQGESTQLQNEIQKLQKTLKKKTKRYMSHKLKI 


6265 


143 


1960 


KHRQENNALDMAP E I HMTG PMCLI ENTNGE LVANPE ALKI LS AI 
TQPWWAIVGLYRTGKSYLMNKLAGKNKGFSLGSTVKSHTKGI 
WM WCVPHP KKPEHTLVLLDTEGLGDVKKGDWQNDSW I FTLAVLb 
SSTLVYNSMGXINQQAMDQLYYVTELTHRIRSKSSPDENENEDS 
ADFVSFFPDFVWTLRDFSLDLEADGQPLTPDEYLEYSLKLTQGT 
SQKDKNFNLPRLCIRKFFPKKKCFVFDLPIHRRKLAQLEKLQDE 
ELDPEFVQQVADFCSYIFSNSKTKTLSGGIKVNGPRLESLVLTY 
INAISRGDLPCMENAVLALAQ I ENSAAVQKAIAHYDQQMGQKVQ 
LPAETLQELLDLHRVSEREATE VYMKNS FKDVDHLFQKKLAAQL 
DKKRDDFCKQNQEASSDRCSALLQVI FS PLEEEVKAGI YSKPGG 
YCLFIQKLQDLEKKYYEEPRKGIQAEEILQTYLKSKESVTDAIL 
QTDQI LTE KE KE I EVE CVKAES AQAS AKMVEEMQI KYQQMMEEK 
EKSYQEHVKQLTEKMERERAQLLEEQEKTLTSKLQEQARVLKER 
CQGESTQLQNEIQKLQKTLKKKTKRYMSHKLKI 


6266


276 


1421 


GSHQKQMLVPCFLYSLQNRKPSLYGSLTCQGIGLDGIPEVTASE 
GFTVNEINKKSIHISCPKENASSKFIiAPYTTFSRIHTKSITCLD 
ISSRGGLGVSSSTDGTMKIWQASNGELRRVLEGHVFDVNCCRFF 
PSGLWLSGGMDAQLKI WSAEDASC WTFKGHKGGI LDTAIVDR 
GRNVVSASRDGTARLWDCGRSACLGVLADCGSS INGVAVGAADN 
SINLGSPEQMPSEREVGTEAKMLLLAREDKKLQCLGLQSRQLVF 
LFIGSDAFNCCTFLSGFLLLAGTQDGNIYQLDVRSPRAPVQVIH 
RSGA P VLS LLS VRDGFIASQGDGS CFI VQQDLD YVTELTGAD CD 
PVYKVATWEKQIYTCCRDGLVRRYQLSDL 


6267 


3 


622 


LGMMKKNNSAKRGPQDGNQQPAPPEKVGWVRKFCGkGIFREIWK
NRY WLKGDQLYIS EKE VKDEKNI QE VFDLSD YEKCE ELR KS KS 
RSKKNHSKFTLAHSKQPGNTAPNLIFLAVSPEEKESWINALNSA 
ITRAKNRILDEVTVEEDSYLAHPTRDRAKIQHSRRPPTRGHLMA 
VASTSTSDGMLTLDLIQEEDPSPEEPTSLC 


6268 


160 


1363 


HRELCQNLPAGLSSAL IDNPLTLLL6 I DTYVMLQEP VT FQD VAV 
DFSREEWGLLGPTQRTEYRDVMLETFGHLVSVGWBTTLENKELA 
PNSDIPEEEPAPSLKVQESSRDCALSSTLEDTLQGGVQEVQDTV 
LKQMES AQEKD LPQKKH FDNR ES QANSGALDTNQ VS LQKI DNPE 
SQANSGALDTNQVLLHKI P PR KRLRKRDS QVKSM KHNSRVKI HQ 
KSCERQKAKEGNGCRKTFSRSTKQITFIRIHKGSQVCRCSECGK 
IFRNPRYFSVHKKIHTGERPYVCQDCGKGFVQSSSLTQHQRVHS 
GERPFECQECGRTFNDRSAISQHLRTHTGAKPYKCQDCGKAFRQ 
SSHLIRHQRTHTGERPYACNKCGKAFTQSSHLIGHQRTHNRTKR 
KKKQPTS 


6269 


2686 


1449 


HASAPTRRNMAAASPLRDCHAWKDARLPLSTTSNEACKLFDATL* 
TQ YVKWTNDKS LGG I EGCLS KLKAADPTFVMGHAMATGLVLIGT 
GSSVKLDKELDLAVKTMVEISRTQPLTRREQLHVSAVETFANGN 
FPKACELWEQ I LQDHPTDMLALKFSHDAYF YLGYQEQMRDS VAR 
IYPFWTPDIPLSS YVKGI YSFGLMETNFYDQABKLAKEALS INP 
TDAWS VHTVAH I HEM KAE I KDGL EFMQHS ETLWKDSDMLACHNY 
WHWALYLIEKGEYEAALTIYDTHILPSLQANDAMLDWDSCSML 
YRLQMEGVSVGQRWQDVLPVARKHSRDHILLFNDAHFLMASLGA 
HDPQTTQELLTTLRDAS ES PGENCQHLLARDVGLPLCQALVEAE
DGNPDRVLBLLLPIRYRIVQLGGSNAQRDVFNQLLIHAALNCT'?" 
SVHKNVARSLLMERDALKPNSPLTERLIRKAATVHLMO 


6270 


23 


2086 


S VTVTLGS EGDGRP PT Y HLEEMEQE PQNGEPAE 1 KI I REAYKKA 
FLFVNKGLNTDELGQKEEAKNYYKQGIGHLLRG1SISSKESEHT 
GPGWESARQMQQKMKETLQNVRTRLE ILE KGLATS LQNDLQE VP 
KLYPEFPPKDMCEKLPEPQSFSSAPQHAEVNGNTSTPSAGAVAA 
PASLSLPSQSCPAEAPPAYTPQAAEGHYTVSYGTDSGEFSSVGE 
EFYRimSQPPPLETLGLDADELIblPNGVQIFFVNPAGEVSAPS 
YPGYLR I VRFLDNSLDTVLNRPPGFLQ VCD WLY PL VPDRSPVLK 
CTAGAYMFPDTMLQAAGCFVGWLSSELPEDDRELFEDLLRQMS 
DLRLQAN WNRAEEENE FQI PGRTRPSSDQLKEASGTDVKQLDQG 
NKDVRHKG KRG KRAKDTS SE E VNLSH I VP CEPVPEE K P KELPEW 
S E KVAHNI LSG AS WVS WGLVKGAE I TGKA I QKGAS KL RER I QP E 
E KPVEVS PAVTKGLYI AKQATGGAAKVSQFLVDGVCTVANCVGK 
E LAPHVKKHGS KLVP E S LKKDKDG KS PLDGAMWAA S S VQG FST 
VWQGLECAAKCIVNNVSAETVQTVRYKYGYNAGEATHHAVDSAV 
NVGVTAYNINNIG I KAMVKKTATQTGHTLLEDYQI VDNSQRENQ 
EGAANVNVRGE KDE QTKE VKEAKK KDK 


6271 




1056 


GCGVKTAGMVGREKELS I HFVrG S CRLVEEE VNI PNRRVLVTGA 
TGLLGRAVHKE FQQNNWHAVGCGFRRARPKFE QVNLLD S NAVHH 
IIIIDF^PHVIVHCy^ERRPDWENQPDAASQLNVDASGNLAKEA 
AAVGAFL I Y I S SDYVFDGTNPP YREED I PAPLNLjYGKTKLDGEK 
AVLENNLGAAVLRIP I LYGE VEKLEESAVTVM FDKVQ FSNKSAN 
MDHWQQRFPTHVKDVATVCRQLAEKRMLDPSIKGTFHWSGNEQM 
TKYEMACAI ADAFNLPS SHLRP ITDS PVLGAQRPRNAQLDCSKL 
ETLGIGQRTPFR1GIKESLWPFLIDKRWRQTVFH 


6272 


1136


528 


GAVMEDAAAPGRTEGVLERQGAPPAAGQGGALVELTPTPGGLAL 
VS P YHTHRAGD PLDLVALAEQVQ KADE FIRANATNKLTVI AEQ I 
QHLQEQARKVLEDAHRDANLHHVACWIVKKPGNIYYLYKRESGQ 
QYFS 1 1 S P KE WGTS CPHDFLGA YKLQHDLS WTP YE DIEKQDAK I 
S MMDTLLS QS VALPPCTE PNFQGLTH 


6273 


256 


843 


SCPRVSPECRS lgcqvmfs lplncspdhirrgs cwgrpqdlkia 

SAAWNS KCHPGAG AAMARQHARTLW YDR PR YVFM E FCVEDS TD V 
HVLIEDHRIVFSCKNADGVELYNEIEFYAKVNSKDSQDKRSSRS 
ITCFVRKWKEKVAWPRLTKEDIKPVWLSVDFDNWRDWEGDEEME 
LAHVEHYAEVRDNTYCVLPT 


6274 
6275


56 


1142 


AAAAMAAAAGGGAGAARS LS R FRGCLAGALLGDCVGS F YEAHDT 
VDLTSVLRHVQSLEPDPGTPGSERTEALY'YTDDTAMARALVQSL 
LAKEAFDEVTJMAHRFAQEYKIOJPDRGYGAGWTVFKKLLNPKCR 
DVFE PARAQFNGKGS YGNGGAMRVAG I SLAYS SVQDVQKFARLS 
AQLTHAS SIiGYNGAI LQALAVHLALQGES SS KHFLKQLLGHMED 
LEGDAQSVLDARELGMEERPYSSRLKKIGELLDQASVTREEVVS 
ELGNGIAAFESVPTAIYCFLRCMEPDPEIPSAFNSLQRTLIYSI 
SI/3GDTDT1 ATMAGAIAGAYYGMDQVPES WQQSCEGYEETD I LA 
QSLHRVFQKS


6276


20 


565 


SRRGRARCLAKUSRRPVPRPAKTMAFMVKTMVGGQLKNLTGSLG 
GGEDKGDGDKSAAEAQGMSREEYEEYQKQLVEEKMERDAQFTQR 
KAERATLRS HFRD KYRLPKNE TDESQ IQMAGGDVE L PR ELAKM I 

EEDTEEEEEKASVLGQLASLPGLNLGSLKDKAQATLGDLKQSAE 
KCHVM 




797 


97 

1 


TLLPLPPLPDTEGMILLNTGLEGTVAENPVPIVHTPSG^ILTLE 
SCLQQIATHPGHWGIHLQIAEPAALRPSLALLARLSSLGLLHWP 
VWVGAKISHGSFSVPGHVAGRELLTAVAEVFPHVTVAPGWPEEV 
LGSGYREQLLTDMLELCQGLWQPVSFQMQAMLLGHSTAGAIGRL 
LASS PRATVTVEHNPAGGDYASVRTALLAARAVDRTRVY YRLPQ 
3YHKDLLAHVGRN 
Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


6277 


4600 


2744 


MAFRTE^LYYSYFKTIVEAPSFLNGVWMIMNDKLTEYPLVINT 
LKRFNLYPEVILASWYRIYTKIMDLIGIQTKICWTVTIGEGLSP 
TESCEGLGDPACFYVAVIFILNGLMMALFFIYGTYLSGSRLGGL 
VTVLCFFFNHGECTRVMWTPPLRESFSYPFLVLQMLLVTHILRA 
i i K\g^uxj\jjK. ±,&Bi v r c rlLi P v*\ie JxCje VJ^TQIASLFAVYVVGY 
IDICKLRKI1 YIHMISLAIiCFVLMFGNSMLLTSYYASSLVI IWG 
ILAMKPHFLKINVSELSLWVIQGCFWLFGTVILKYLTSKIFGIA 
NDAHIGNLLTSKFFSYKDFDTLLYTCAAEFDFMEK3TPLRYTKT
LLLPWLVGFVAI VRKI ISDMWGVLAKQQTHVRKHQFDHGELVY 
HALQLLAYTALG I LIMRLKLFLTPHMCVMASLI CSRQLFG WLFC 
KVHPGAIVFAIIiAAMSIQGSANLQTQWNIVGBFSNLPQEBLIEW 
x iv x o j. Arurt v r jvjjM'ttri. s*\j\i> V j\ 1 hSALiK P 1 VNH PH YEDAGLRART 

KIWSMYSRKAAEEVKRELIKLKVNYYILEESWCVTIRSKPGCSM 

PEIWDVEDPANAGKTPLCNLLVKDSKPHFTTVFQNSVYKVLEW 
KB 


6278


3 


823 


ILFRLVLLSLVYLLNSVATEERKPAEVLIVfeXSQQYAWGTVLLL 
IRIILEYCQGVDNIPSVTTDMIiTRLSDLLKYFNSRSCQLVLGAG 
ALQWGLKTITTKNLALSSRCLQL IVHY I PV1RAHFEARLPPKQ 
YSMLRHFDHlTKDYHDHIAEISAiCLVAIMDSLFDICLLSKYEVKA 
PVPSACFRNICKQMTKMHEAIFDLLPEEQTQMLFLRINASYKLH 
L KKQLSHLNV I NDGG PQNGLVTAD VAF YTGNLQALKGLKDLDLK 
MAEIWEQXR 


6279 


127 


1687


GGAMASDGARKQFWKRSWSKLPGSIQHVYGAQHPPFDPLLKGTL 
LRSTAKMPTTPVKAKRVSTFQEFESNTSDAMDAGBDDDELLAMA 
AESLNSEWMETANRVLRNHSQRQGRPTLQEGPGLQQKPRPEAE 
P PS PP S GDLRLVKS VS ESHTSC P AESAS DAAPLQRSQSLPHSAT 
VTLGGTSD PSTLS S SALSEREAS RLDKFKQLLAG PNTDLEE LRR 

ls wsg i pkpvrpmtwkllsgylpanvdrrpatlqrkqk2yfafi 
ehyydsrndevhqdtyrqihidiprmspealilqpkvteifer: 
lfiwairhpasgyvqgindlvtpffwficeyieaeevdtvdvs 
gvpaevlcnieadtywcmsklldgiqdnytfaqpgiqmkvkmle 
elvsrideqvhrhldqhevrylqfafrwmnwllmrevplrctir 
lwdtyqs e pdg fs h fh l yvcaafl vrwrke i leekd fqe lllfl 
qnlptahwddedislllaeayrlkfafadapnhykk 


6280 


857 


2515 


ECCDQKMGSRNSSSAGSGSGDPSEGLPRRGAGLRRSEEEEEEDE 
DVIILAQVLAYLLRRGQVRLVQGGGAANLQFIQALLDSEEENDRA 
w uokujuk i im p v vjua I P DTRE LE F NE I KTQ VE LATGQLGLRRAA 
QKHSFPRMLHQRERGIiCHRGSFSLGEQSRVISHFLPNDLGFTDS 
YSQKAFCX3IYSKDGQIFMSACQDQTIRLYDCRYGRFRKFKSIKA 
RDVG WS VLDVAFTPDGNHFLYSS WSDYIH I CN I YGEGDTHTALD 
LRPDERRFAVFSIAVSSDGREVLGGANDGCLYVFDREQNRRTLQ 
IESH EDDVNAVAFAD I SSQ I LFSGGDDAICKVWDRRTMREDDPK 
PVGALAGHQDG X TFIDS KGDAR YLI SNS KDQTIXLWD I RRFSS R 
BGMEASRQAATQQNWDYRWQQVPKKAWRKLKLPGDSSLMTYRGH 
GVLHTLIRCRFSPIHSTGQQFIYSGCSTGKVWYDIiLSGHIVKK 
LTNHKACVRDVS WHPFEE KI VSS S WDGNLRLWQ YRQAEY FQDDM 
PBSEECASAPAPVPQSSTPFSSPQ 


6281 


857 


2515 


ECCDQKMGSRNSSSAGSGSGDPSEGLPRRGAGLRRSEEEEEEDE
DVDLAQVLAYLLRRGQVRLVQGGGAANLQFIQALLDSEEENDRA 
HDGRLGDR YN P P VDATPDTRE LEFNE I KTQVE LATG QLGLRRAA 
QKHSFPRMLHQRERGLCHRGSFSLGEQSRVISHFLPNDLGFTDS 
YSQKAFCG I YS KDGQIFMSACQDQT I RL YDCR YGRFRKFKS I KA 
RDVGWS VLDVAFTPDGNHFIiYS SWSD Y IHI CN I YGEGDTHTALU 
LRPDERRFAVFSIAVSSDGREVLGGANDGCLYVFDREQNRRTLQ 
IESHEDDVNAVAFADI SSQ I LFSGGDDAICKVWDRRTMREDDPK 
PVGALAGHQDG I TFI DSKGDAR YLISNSKDQTI KLWDIRRFSS R 
Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








EGMEASRQAATQQNWDYRWQQVPKKAWRKLKLPGDSSLMTYRGH 
GVLHTLIRCRFSPIHSTGQQFIYSGCSTGKVWYDLLSGHIVKK 
LTNHKACVRDVSWHPFEEKIVSSSl^DGNIiRLWQYRQAEYFQDDM 
PESEECAS APAP VPQS STP FSS PQ 


6282 


125


906 


RMAACRALKAVLVDLSGTLHI EDAAVPGAQEALKRLRGAS VI IR 
FVTNTTKES KQDLLERLRKLE PDI5EDE 1 FTSLTAARS LLERKQ 
VRPMLLVDDRALPDFKGIQTSDPNAWMGLAPEHFHYQILNQAF 
RLLLDGAPLIAIHKARYYKRKDGLALGPGPFVTALEYATDTKAT 
WGKPEKTFFLEALRGTGCEPEEAVMIGDDCRDDVGGAQDVGML 
G I L VKTGK YRAS DEE K I NP PPYLTCES F PHAVDHI LQHLL 


6283 


140 


1043 


LSLFGIHVMNPFWSMSTSSVRKRSEGEEKTLTGDVKTSPPRTAP 
KKQLPS I PKNALPITKPTS PAPAAQSTNGTHAS YGPFYLE YSLL 
AEFTLWKQKLPGVYVQPSYRSALMWFGVIFIRHGLYQDGVFKF 
TVYI PDNYPDGDCPRLVFDI PVFHPLVDPTSGELDVKRA FAKWR 
RIWNHIWQVLMYARRVFYKIDTASPLNPEAAVLYEKDIQLFKSK 
WDSVKVCTARLFDQPKIEDPYAISFSPWNPSVHDEAREKMLTQ 
KKKPEEQHNKSVHVAGLSWVKPGSVQPFSKEEKTVAT 


6284 


1 


2879 


RSVIPGSTISSRWPGLSRPRFMAAHEWDWFQREELIGQISDIRV 
QNLQVERENVQKRTFTRWINLHLEKCNPPLEVKDLFVDIQDGKI 
LMALLE VLSGRNLLHEYKS SSHRI FRIiNNIAKALKFLEDSNVKL 
VS I DAAE IAD GNPS LVLGL I WN I IL FFQ I KE LTGNLSRNS PS SS 
LAPGSGGTDSDSSFPPTPTAERS VAI S VKDQRKAI KALLAWVQR 
KTRKYGVAVQDFAGSWRSGLAFLAVIKAIDPSLVDMKQALENST 
RENLEKAFS I AQDALH I P RLLE PED I MVDTPDE QS I MT Y VAQ FL 
ERFPELEAEDIFDSDKEVPIESTFVRIKETPSEQESKVFVLTEN 
GERTYTVNHETSHPPPSKVFVCDKPESMKEFRIiDGVSSHALSDS 
STE FMHQ 1 1 DQVLQGG PG KTS DISBPSPESSILSSR KENGRSNS 
LPIKKTVHFEADTYKDPFCSKNLSLCFEGSPRVAXESLRQDGHV 
LAVEVAEEKEQKQESSKIPESSSDKVAGDIFIiVEGTNNNSQSSS 
CNGALE STARHDE ES HS LS P PGENTVMADS FQI KVNLMTVEALE 
EGDYFEAI PLKASKFNSDL IDFASTSQAFNKVPS PHETKPDEDA 
EAFENHAEKLGKRS IKSAHKKKDSPEPQVKMDKHEPHQDSGEEA 
EGCPSAPEETPVDKKPEVHEKAKRKSTRPHYEEEGEDDDLQGVG 
EELSSSPPSSCVSLETLGSHSEEGLDFKPSPPIjSKVSVIPHDLF 
YFPHYEVPLAAVLEAYVEDPEDIjKNEEMDLEEPEGYMPDLDSRE 

eeadgsqssssssvpgeslpsasdqvlylsrggvgttpasepap 
laphedhqqretkendpmdshqsqespnlenianpleenvtkes 
isskkkekrkhvdhvesslfvapgsvqssddleedssdysipsr 
tshsdss i ylrrhthrssesdhfslcsveersrsg 


6285 


2157 


1531 


SCKTENLLEMWW FQQGLS FLP S AL V I WTS AAF I FS Y I TAVTLHi l"" 

IDPALPYISDTGTVAPEKCLFGAMLNIAAVLCIATIYVRYKQVH 

ALSPEENVI I KLNKAGLVLG I LS CLGLS IVANFQKTTLFAAHVS 

GAVLTFGMGSLYMFVQTILSYQMQPKIHGKQVFWIRLLLVIWCG 

VSALSMLTCSSVLHSGNFGTDLEGKLHWNPEDKGYVLHMITTAA 

E WSMSF S FFGFFLT Y I R DFQKI SLRVE ANLHGLTL YDTAPCP IN 

NERTRLLSRDI 


6286


1619 





i^UiA^cCGSANPYVSVGKSCVLLAMAQLQTRFYTDNKKYAVDDV 
PFSIPAASEIADLSNIINKLLKDKNEFHKHVEFDFIiIKGQFLRM 
PLDKHMEMENISSBEWEIEY VEKYTAPQPEQCMFHDDWISS I K 
GAEEWILTGSYDKTSRIWSLEGKSIMTIVGHTDWKDVAWVKKD 
SLS CLLLSASMDQTILLWEWNVERNKVKALHCCRGHAGS VDS IA 
VDGSGTKFCSGSWDKMLXIWSTVPTDEBDEMEESTNRPRKKQKT 
EQLGLTRTPIVTLSGHMEAVSSVLWSDAEEICSASWDHTIRVWD 
VBSGSLKSTLTGNKVFNCISYSPLCKRLASGSTDRHIRLWDPRT 
KDGSLVSLSLTSHTGWVTSVKWSPTHEQQIilSGSLDNIVKLWDT 
RSCKAPLYDLAAHEDKVLSVDWTDTGLLLSGGADNKLYSYRYSP 
Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
TTSHVGA


6287 


278




My* t ^Nf mGbKSTSGKEKySGDAGFLGDALQLPLQCLALDEDF 
APAiOQVQKILCDLLLPENLKEGLKBSSWSSLPCTKNRPPDPHS 
VMEESQSLNEPSPKQSEEIPEVTSEPVKGSLNRAQSAQSINSTE 
MPAREDCLKRVSSEPVLSVQEKGVLLKRKLSLLEQDVIVNEDGR 
NKLKKQGETPNEVCMFSLAYGDIPEELIDVSDFECSLCMRLFFE 
PVTTPCGHSFCKNCLERCLDHAPYCPLCKESLKEYLADRRYCVT 
QLLEELI VXYLPDELSERKK I YDEETAELSHLTKNVP I FVCTMA 
YPTVPCPLHVFEPRYRLMIRRSIQTGTKQFGMCVSPTQNSFADY 
GCMLQIRNVHFLPDGRSWDTVGGKRFRVLKRGMKDGYCTADIE 
YLEDV 


6288 


1 


743 


VTLYPCRGliVGNLLLGASGMASGCKIGPSILNSDLANLGAECLR 
MLDSGADYLHLDVMDGHFVPNITFGHPWESLRKQLGQDPFFDM 
HMM VSKPE QWV KPMAVAGANQ YTFHLEATEN PG AL I KD I RENGM
KVGIAI KPGTS VE YLA P WANQ I DMAL VMT VE PGFGGQ KFMEDMM 

PKVHWLRTQFPSLDIEVIX3GVGPiyrVHKCAEAGANMIVSGSAIM 
RSEDPRSVINLLRNVCSEAAQKRSLDR 


6289 
6290


1 


743 


VTLYPCRGLVGNLLLGASGMASGCKIGPSILNSDLANLGAECLR 
MI»DSGAD YLHIiDVMDGHFVPNITFGHP WES LR KQLGQDP FFDM 
HMMVS KPEQWVXPMAVAGANQYTFHLEATENPGALl KD I RKNGM 
KVGLAI KPGTS VEYLAPWANQI DMAL VMT VE PG FGGQKFMEDMM 
PKVH WLRTQFP SLDI E VDGGVG PDTVHKCAEAGANMI VSGSAIM 
RSED PRSVINLLRNVCS EAAQKRSLDR 




3 


1856 


TLGRWLLGVYETVAPTLACLPRPRLRRRRRRRRRRMISRYTRKA 
VPQSLELKGITKHALNHHPPPEKLEEISPTSDSHEKDTSSQSKS 
DITRESSFTSADTGNSLSAFPSYTGAGISTEGSSDFSWGYGELD 
QNATE KVQTMFTAI DELLYEQKLS VHTKS LQ EE CQQWTASFPHL 
RILGRQIITPSEGYRLYPRSPSAVSASYETTLSQERDSTIFGIR 
GKKLHFSSSYAHKASS IAKSSSFCSMERDEEDS I I VSEGI IEEY 
LAFDHIDIEEGFHGKKSEAATEKQKLGYPPIAPFYCMKEDVLAY 
VFDSVWCKWSCMEQLTRSHWEGFASDDESNVAVTRPDSESSCV 
LSELHPLVLPRVPQSKVLYITSNPMSLCQASRHQPNVNDLLVHG 
MPLQPRNLSLMDKLLDLDDKLLMRPGSSTILSTRNWPNRAVEFS 
TSSLS YTVQSTRRRNPPPRTLHP ISTSHSCAETPRSVEE ILRGA 
RVPVAPDSLSSPSPTPLSRNNLLPPIGTAEVEHVSTVGPQRQMK 
PHGDS SRAQS AWDE PN YQQ PQERLLLPDFFPR PNTTQS FLLDT 
Q YRRS CAVE YPHQ ARPGRGS AG PQLHGS TKSQSGGRP VSRTRQG 
P 


6291 
6292 


1732 


602 


l-VAKMASSASARTPAGKRVINQEELRRLMKEKQRLSTSRKRIBS 
PFAKYNRI^QI^CALCNTPVKSELLWQTHVLGKQHREKVAELKG 
AKEASQGSSASSAPQSVKRKAPDADDQDVKRAKATLVPQVQPST 
SAWTTNFDKIGKEFIRATPSKPSGLSLLPDYEDEEEEEEEEEGD 
GERKRGDASKPLSDAQGKEHSVSSSREVTSSVLPNDFFSTNPPK 
A P 1 1 PHSGS IE KAE I H EKWERRENTAE ALPEGF FDD PEVDARV 
RKVDAPKDQMDKEWDEFQKAMRQVNTISEAIVAEEDEEGRLDRQ 
IGE I DEQ I EC YRR VE KLRNRQDEIKNKLKEI LTI KE LQKKEEBN 
ADSDDEGELQDLLSQDWRVKGALL 




1835 


1142 


TCPGAMKMVAPWTRFYSNSCCLCCHVRTGTILLGVWYLI INAW 
LLILLSALADPDQYNFSSSELGGDFEFMDDANMCIAIAISLLMI 
L I CAMAT YGA YKQRAA W 1 1 P F FC YQ I FD FALNMLVA I TVT, I Y PN 
SIQEYIRQLPPNFPYRDDVMSVNPTCLVLI ILLFIS I ILTFKGY 
US C VWNCYRY 1 NGRNSSD VL VYVTSNDTTVLLP P YDDATVNGA 
AKEPPPPYVSA 


6293 


2382 


1035 


tWCTiiGTVDVHPlGWCAINSKILVPPRTIHAKFTDWKGYLMKRL
VGS RTLPVDFHI KM VESMKYP FRQGMRLEWDKSQVS RTRMAW 
DTVIGGRLRLLYEDGDSDDDFWCHMWSPLIHPVGWSRRVGHGIK 
SEQ
ID
NO:


Predicted
beginning
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








MSERRSDMAHHPTFRKIYCDAVPYLFKKVRAVYTEGGWFEEGMK 
LEAIDPIiNLGNICVATVCKVIiLIXSYLMICVDGGPSTDGLDWFCY 
HASSHAIFPATFCQKNDIELTPPKGYEAQTFNWENYLEKTKSKA 
APSRLFNMDCPNHGFKVGMKLEAVDLMEPRLICVATVKRVVHRL 
LSIHFDGWDSEYDQWVDCESPDIYPVGWCELTGYQLQPPVAAEP 
ATPLKAKEATKKKKKQFGKKRKRIPPTKTRPLRQGSKKPLLEDD 
PQGAR KIS S E P VPG E I IAVRVKEEHLDVAS PDKAS S P ELPVSVE 
NIKQETDD 


6294 


354 


1814 


AQLTTRGRTVAGGVRWI PSPFPDLELYSCCLGTDRGFPELSHHC

KNVIATASDYDMAE I TNIRPS FDVS PWAGLIGASVLWCV8VT 

VFVWSCCHQQAEKKHKNPPYKFIHMLKGI5IYPETLSNKKKIIK 

VRRDKDGPGREGGRRNLLVDAAEAGLLSRDKDPRGPSSGSCIDQ 

LP I KMD YG EELRS P I TS LTPGE S KTTS P S S PEEDVMLGS LTFS V 

DYNFPKKAIiWTIQEAHGLPVMDDQTQGSDPYlKMTILPDKRHR 

VKTRVLRKTLDP VFDETFTFYG I P YSQLQDLVLHFLVLS FDRFS 

RDDVIGE VMVPLAGVDPSTGKVQLTRDI I KRNIQKCISRGELQV 

SLSYQPVAQRMTVWLKARHLQKMDIAGLSGNPYVKVNVYYGRK 

RIAKKKTHVKKCTLNPIFNESFIYDIPTDLLPDISIEFLVIDFD 

RTTKNEVVGRLILGAHSVTASGAEHWREVCESPRKPVAKWHSLS 


6295 


2795


617 


VS S ALLTGATSGSDAAKSEGAS AS PLS CTNAVAMDR PDEG P PAK 
TRRLSSSESPQRDPPPPPppppLLRLPLPPPQQRPRLQEETEAA 
QVLADMRGVGU3PALPPPPPYV I LEEGGIRAYFTLGAECPGWDS 
T I E SGYGEAP P PTES LEALPTPEAS GGSLE IDFQ WQSS S FGGE 
GALETCSAVGWAPQRLVDPKSKEEAI 1 I VEDEDEDERESMRS S R 
RRRRRRRRKQRKVKRESRERNAERMBSILQALEDIQLDLEAVNI 
KAGKAFLRLKRKFIQMRRPFliERRDIjI IQHI PGFWVKAFLNHPR 
I S I L INRRD EDI FR YLTNLQVQDLRH I SMG YKM KL YFQTNP Y FT 
NMVIVKEFQRNRSGRLVSHSTPIRWHRGQEPQARRHGNQDASHS 
FFSWFSNHSLPEADRIAEIIKNDLWVNPLRYYLRERGSRIKRKK 
QEMKKRKTRGRCEWIMEDAPDYYAVEDIFSEISDIDETIHDIK 
ISDFMETTDYFE TTDNE I TD I NEN I CDS ENPDHNE VPNNETTDN 
NESADDHETTDNNESADDNNENPEDNNKNTDDNEENPNNNENTY 
GNNFFKGGFWGSHGNNQDSSDSDNEADEASDDEDNDGNEGDNEG 
SDDDGNEGDNEGSDDDDRD I EYYEKVI EDFDKDQADYEDVI E I I 
SDESVEEEGIEEGIQQDEDIYEEGNYEEEGSEDVWEEGEDSDDS 
DLEDVLQVPNGWANPGKRGKTG 


6296


727 


1199 


RHCGCDAQGACDS LPPTGTSSPVTARNAI PEARCCVWLLDGTTV 
EAVRPARERLARKELRQKRMQQFSRDSAYSSNKDSTCLLTERDT 
LGTSLQFPSPFSGTISFGSFSDSGIFPLGSQCCLGFQQFSISGK 
KWAL I HKRVRLS VFGARWGR I YFGK 


6297 


1 


922 


QRAAAAS PS SCG PRGAE YGALMAMEGYWRFLALLGS ALL VG FLS 
VIFALVWVLHYREGLGWDGSALEFNWHPVLMVTGFVFIQGIAII 
VYRLPWT WKCS KLLMKS I HAGLNAVAAI LAI IS WAVFENHNVN 
NIANMYSLHSWVGLIAVICYLLQLLSGFSVFLLPWAPLSLRAFL 
M P IH VYSG I VI FGTV I ATALMGLTEKL I FS LR DPAYS TFP PEGV 
* v «i«AJuiii,i»vf \j/uiifWAv A«rywj\KPKEPNSTILHPNGGTEQ 
GARGSMPAYSGNNMDKSDSELIWEVAARKRNLALDEAGQRSTM 


6298 


3 


985 


SVPLRRLSLSGTI^GAGTTTKMAVARLAAVAAWVPCRSWGWAAV 
P FGPHRGLS VLLAR I P QRAPR WLPACRQKTS LS FLNR PDLPNLA 
YKKLKGKS PGI I FI PG YLSYMNGTKALAI EEFCKSLGHACIRFD 
YS G VGSS DGNS E ESTLGKWRKDVLS 1 1 DDLADGPQ I L VG S S LGG 
WLMLHAAIARPEKVVALIGVATAADTLVTKFNQLPVELKKEVEM 
KG VW SMPS KYSEEG VYNVQYS F I KEAEHHCLLHS P I PVNCP IRL 
LHGMKDDI VPWHTSMQ VADRVLS TD VDV I LRKHSDHRMRE KADI 
QLLVYTIDDLIDKLSTIVN 
SEQ
ID 

NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end
nucleotide
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


6299 


512 


814 


ECDIjEG^PNVTISLSLPTNGSPL0DTi\VHPrvT<5T rxs&TT tcc 

SIDAMDDSAFSGPYKFPFTPPLESFNLCPTTSQVPVPPILGFYQ 
MKEEEVQLRNNH 


6300 


121 


692 


AAPS CWSQRGVPAAGTPSSPRLLVSRAAAPS AG PWGAWRQGARA 
AQS P ?S IPN S SS VPYGS QDS VHS S PEDGGGGRDR PVGGS PGGPR 
LVIGS LPAHLS PHMFGGPKCPVCS KFVSSDEMDLHLVMCLTKPR 

ITYNEDVLSKDAGECAICLEELQQGDTIARLPCLCIYHKGCIDE 
WFEVNRSCPEHPSD 


6301 


616 


284 


GKFVPVNWEPPQPLPFPKYLRCYRCLLETKELGCLLGSDlCLTP

AGSSCITLHKKNSSGSDVMVSDCRSKEQMSDCSNTRTSPVSGPW 

IFSQYCFLDFCNDPQNRGLYTP 


6302 


490 


745 


IFGFLHLFHMEHSFLLVCALFAHVFFSSSCGSSVALHSDPCLLS
PVLLNCLPGDLRPLDELYAQKLKYKAISEELDHALNDMTSL 


6303 


2 


1961 


YKNEYGGGLLWQSWQEKHPGQALSSEPWNFPDTKEEWEQHYSQL 
YWY Y LEQFQ YWEAQGWT FDASQSCDTDTYTS KTEADDKNDEKCM 
KVDLVS FLSSP IMGDNDSSGTSDKDHSEILDGISNIKLNSEEVT 
QSQLDS CTSHDGHQQLS E VS S KRE C PASGQS E PRNGGTNEESNS 
iw i UFPAbUoU^SGANTSKDRPHASGTDGDESEEDPPEHK 
PS KLKRSHE LD I DENPASD FDDSGS LLGFKYG SGQK YGG I PNFS 
HRQ VR YLEKNVKL KSKYLDMRRQ IKMKNXHI FFTKES EKP FFKK 
SKILSKVEKFLTJWNKPMDEEASQESSSHDNGHDASTSCDSEEQ 
DMSVKKGDDLLETNNPEPEKCQSVSSAGEIjETENYERDSIilATV 
PDEODCVTQEVPPSRQAETEAEVKKKKNKKKNKKVNGLPPEIAA 
VPELAKYWAQRYRLFSRFDDGIKLDREGMFSVTPEKIAEHIAGR 
VSQSFKCDVWDAFCGVGGNTIQFALTGMRVIAIDIDPVKIALA 
RNNAE VYG I ADK I E FICGDFLLLAS FL KAD WFLS P P W3GPD YA 
TAETFDIRTMMS PDGFEI FRLSKK1TNNI VYFLPRNADIDQVAS 
LAGPGGQVEIEQNFLNNKLKTITAYFGDLIRRPASET 


6304 


1 


1438 


ntu » vjJK&tOSc rOvjUUKMl'OKVKKUI TXjSGHPRLSTQHWLLRE 

devgd pgtkdlghpqhgs p iqetqse wtlvsplpgsdmaalpa 
wratsgltlwphtaegrdllgaenraltggqqaedptiasgayq 

WPGSVElfTjnnQ\/^JPnMrTT.T.CCCO , TV i r»r»nT3 0MT *r*T\Tlm m-Ktv nr r 

v mujwuavwnw£iUjjj&{>&Kll»b^APt > WL I PHPVQMTiRLrL 

AQGEWDKARVPAHGQVLQVGFSTEAALQDLSS PRLSQLCSQGt 
CGLIKRPGDLPEVLSFHVDRVLGLRRSLPAVARRFHSPLLPYRY 
TDGGAR P VI WWAPD VQHLS DP D EDQNS LALGWLQ YQALLAHS CN 
WPGQAP CPG IHHTE WARLAL FDFLLQVHDRLDR YCCGFEPEPSD 
PCVEERLREKCRNPAELRLVHILVRSSDPSHLVYIDNAGNLQHP 
EDKLNFRLLEGIDGFPESAVKVLASGCLQNMLLKSLQMDPVFWE 
SQGGAQGLKQVLQTLEQRGQVLLGHIQKHNLTLFRDEDP 


6305 


99 


420 


NMIWRGRSTYRPRPRRSVPPPELIGPMLEPGDEEPQGEEPPTES 
RD PAPGQ ERE EDQGAAETQ VPDLEADLQ ELS QS KTGDECGDG PD 
VQGKILTKSEQFKMPEGR 


6306 


1 


1874 


PTRPSKVKVPHTFLIHSYTRPTVCQACKKLLKGIiFRQGLQCKDC 
KFNCHKRCATRVPNDCLGEALINGDVPMEEATDFSEADKSALMD 
ESSDSGVI PGSHSENALHASEEEEGEGGKAQSSLGYI PLMRWQ 
SVRHTTRKSSTTLREGWVVHYSNKDTLRKRHYWRLDCKCITLFQ 
NNTTNR Y YKE I PLS E 1 L TVES AQNFSLVP PGTNPHCFE I VTANA 
TYFVGEMPGGTPGGPSGQGAEAARGWETAIRQALMPVILQDAPS 
APGHAPHRQAS LS I SVSNSQI QENVDIATVYQ I FPDE VLGSGQF 
GVVYGGKHRKTGRDVAVKVIDKLRFPTKQESQLRNEVAIXiQSLR 
HPGIVNLECMFETPEKVFWMEKLHGDMLEMILSSEKGRLPERL 
TKFLITQ I LVALRHLHFKNIVHCDLKPENVLLASADPFPQVKLC 
DFGFARI IGEKSFRRS WGTPAYLAPEVLLNQGYNRSLDMWSVG 
VIMYVSIjSGTFPFNEDED INDQ IQNAAFMYPAS PWSH I SAGAID 
L INNLLQ VKMRKRYS VDKSLS H P WLQE YQ TWLDLRELEGKMGER 
YITHESDDARWEQFAAEHPLPGSGLPTDRDIiGGACPPQDHDMQG 
SEQ
ID 

NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
LAERXSVL


6307 


2136 


589 


CFLLPRGRDPEPPEAGAAAPCAPGAPDMSFRKVVRQSXFRKVFG
yrvwsiuyL i OJJiKVaKVTWDSTr CAVNPKFLAVIVEASGGGAFL 
VLPLSKTGRIDKAYPTVCGHTGPVLDIDWCPHNDEVIASGSEDC 
TVMVWQ I P ENGLTS PLTE P VWLEGHTKRVG 1 I AWHPTARNVLL 
SAGCDNVVLIIWVGTAEELYRLDSLHPDLIYNVSWNHNGSLFCS 
ACKDKSVRI IDPRRGTLVAEREKAHEGARPMRAI FLADGKVFTT 
G FS RMS ERQ LALWDP ENLE EPMALQELDSSNG ALLPFYD PDTS V 
VYVCGKGDSSIRYFE ITEEPPY1 HFLNTFTSKEPQRGMGSMPKR 
GLEVS KCE IARFYKLHERKCEPI VMTVPRKSDLFQDDLYPDTAG 
PEAALEAEEVrVSGRDADPILISLREAYVPSKQRDLKISRRNVLS 
DS RP AMAPG S SHLGAPASTTTAADATPSGS LARAGEAG KLE E VM 
QELRALRALVKEQGDRI CRLEEQLGRMENGDA 


6308


2 


1118 


GRPTRPEKMLLSLVLHTYSMRYLLPSWLLGTAPTYVLAWGVWR

LLSAFL P ARF YQALDDRL YC VYQSMVL FFFEN YTG VQI LL YGDL 

PKNKENIIYLANHQSTVDWIVADILAIRQNALGHVRYVLKEGIJC 

WLPIjYGWYFAQHGGIYVKRSAKFNEKEMRNKLQSYVDAGTPMYL 

VIFPEGTRYNPEQTKVLSASQAFAAQRGLAVLKHVLTPRIKATH 

VAFDCMKNYLDAIYDVTWYEGKDDGGQRRESPTMTEFLCKECP 

KIHIHIDRIDKKDVPEEQEHMRRWLHERFEIKDKMIilEFYESPD 

PERRKRFPGKSVNSKLSIKKTLPSMLILSGLTAGMI»MTDAGRKL 

YVNTW I YGTLLGCLWVTI KA 


6309 


220 


563


LVAEVKEPCSIjPMLSVDMENKENGSVGVKNSMENGRPPDPADWA 
VMDWNYFRTVGFEEQASAFQEQEIDGKSLLLMTRNDVLTGLQL 

KLGPALKI yeyhvkplqtkhlknnss 


6310 


36 


979 


GPRCWKFLILSSVNCETLRIGKAWPQSfiGQERYWTPRTHSSAS2
AQRGSLA3LNVAAAGLWADCDQPLYDCPMCGLI CTNYHILQEHV 
DLHLEENSFQQGMDRVQCSGDLQLAIIQLQQEEDRKRRSEESRQE 
I EEFQKLQRQYGLDN3 GGY KQQQLRNME I EVNRGRM P PS E FHRR 
KADMMESLALGFDDGKTKTSGI IEALHRYYQNAATDVRRVWLSS 
WDHFHSSLGDKGWGCGYRNFQMLLSSLLQNDAYNDCI..KGMLIP 
CI PK I Q SMI EDAWKEG FD PQGAS QL I IR LQGTKAWIGACEVY I L 
LTSLRV 


6311 


1 


675 


PWWNSCEGPRIAAAARTGHGVGRRARIACLGEPRVKAAVKIjTXj 
ASKLKRDDGLKGSRTAATASDSTRRVSVRDKLLVKEVAELEANL 
PCTCKVHFPDPNKLHCFQLTVTPDEGYYQGGKFQFETEVPDAYN 
MV? PKVKCLTKI WHPNITETGE I CLSLLREHS IDGTGWAPTRTL 
iujv vH^uwaur lULihNr JJOPl4WXr^iAEHHltRDKEDFRNXVDDYI 
KRYAR 


6312 


213 


1400 


GDELVKREAGMKMLPGVGVFGTGSSARVLVPLLRABGFTVEALW 
GKTEEEAKQLAEEMNI AFYTS RTDD I LLHQD VD LVCI S I PPPLT 
RQ I S VKALG I GKNWCEKAATS VDAFRM VTAS RYY PQLMS LVGN 
VLRFLPAFVRMKQL I SEHYVGAVM I CDARI YS GSLLS P S YGW I C 
DELMGGGGLHTMGTYIVDLLTHLTGRRAEKVHGLLKTFVRQNAA 
I RG I RHVTS DDFC FFQMLMGGGVCSTVTLNFNMPGAF VHE VM W 
GS AGRLVARG ADLYGQ KNS ATQE ELLLRDS LAVGAGLP EQGPQD 
VPLLYLKGM V YMVQALRQSFQGQGDRRT WDRT P VSMAAS FEDGL 
YMQSWDAI KRSSRSGEWEAVBVLTEEPDTNQNLCEALQRKNL 


6313 


2 


2071 


QRSGAARLAFLPS P FS P ACVHRS PLS FHGCWF Y F VW FM PLG VL
FHRRRAHGCTLSCSSFVEQPTAMEAEETMECLQEFPEHHKMILD 
RLNEQREQDRFTDI TL I VDGHH FKAHKAVLAACS KFFY KFFQ EF 
TQEPLVEIEGVSKMAFRHLIEFTYTAKIjMIQGEEEANDVWKAAE 

flqmleai kale vrnkensapleenttgknbakkrki aetsnvi 
teslpsaesepveieveiaegtievedegietleevasakqsvk 

YIQSTGSSDPSALALLADITSKYRQGDRKGQIKEDGCPSDPTSK 

qvegieivblqlshvkdlfhcekcnrsfklfyhfkehmkshste 
SEQ
ID

NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








SFKCEICNKRYLRESAWKQHLNCYHLEEGGVSKKQRTGKKIHVC 
QYCEKQFDHFGHFKEHLRKHTGEKPFECPNCHERFARNSTLKCH 
LTACQTG VG AKKGRKKLYECQ VCNS VFNS WDQFKDH L VI HTGD K 
PNHCTLCDLWFMQGNELRRHLSDAHNISERLVTEEVLSVETRVQ 
TE PVTSMTI I EQVG KVHVLP LLQVQ VDSAQVT VEQVHPDLLQDS 
QVHDS HMSE L P EQ VQ VS YLE VGR I QTEEGTE VHVEELHVE RVNQ 
MPVEVQTELLEADLDHVTPEIMNQEERESSQADAAEAAREDHED 
AEDLETKPTVDSEAEKAENEDRTALPVLE 


6314 


2 


2071


QRSGAARLAFLPSPFSFACVHRSPLSFHGCWFYFVVVFMPLGVL 
FHRRRAHGCTLSCSSFVEQPTAMEAEETMECLQEFPEHHKMILD 
RLNEQREQDRFTDITLIVDGHHFKAHKAVLAACSKFFYKFFQEF 
TQEPLVEIEGVSKMAFRHLIEFTYTAKXiMIQGEEEANDVWKAAE 
FLQMLEAIKALEVRNKBNSAPLEENTTGKNEAKKRKIAETSNVI 
TESLPSAESEPVEIEVBIAEGTIEVEDEGIETLEEVASAKQSVK 
Y I QS TGS SDDS ALALLAD I TS KYRQGDRKGOI KE DGCP SD PTS K 
Q VEG I E I VE LQLSHVKDL FHCEKCNRS F KLF YH FKEHMKSHSTE 
SFKCEICNiCRYLRESAWKQHLNCYHLEEGGVSKKQRTGKKIHVC 
QYCE KQFDHFGHFKEHLRKHTGE KPFECPNCHER FARNS TLKCH 
LTACQTG VGAKKGRKKLYECQVCNSVFNS WDQFKDH LVI HTGDK 
PNHCTLCDLWFMQGNELRRHLSDAHNISERLVTEEVLSVETRVQ 
TEPVTSMTIIEQVGKVHVLPLLQVQVDSAQVTVEQVHPDLLQDS 
QVHDSHMSELPEQVQVSYLEVGRIQTEEGTEVHVEELHVERVNQ 
MPVEVQTELLEADLDHVTPEIMNQEERESSQADAAEAAREDHED 
AEDLETKPTVDSEAEKAENEDRTALPVLE 


6315 


1 


1015 


LGLAVWWTTLVLX SYCPTATEBAPYWtYIjIiCALGLFI YQSLDA 
IDGKQARRTNS CSPLGELFDHGCDSLS TVFMAVGAS IAARLGTY 
PDWFFS CSPIGMFVFYCAHWQTYVSGMLRFGKVDVTEIQIALVI 
VFVLSAFGGATMWDYTI P ILE IKLKI LPVLGFLGGVI F5CSNYF 
HVILHGGVGKNGSTIAGTSVLSPGLHIGLIIILAIMIYKKSATD 
VFEKHP CL YILM FGCV FAKVS QKLWAHMTKS E L YLQDT VFLGP 
GLLFLDQ YFNNF I DE YWLWMAMV IS S FDM VI Y FS ALCLQ I S RH 
LHLN I FKTACHQAPEQVQVLS S KSHQNNMD 


6316 


1503 


792 


VS AGAGTG I MGGTTSTRRVTFEADENEN ll TWKGI RLS ENV t DR 
MKESSPSGSKSQRYSGAYGASVSDEELKRRVAEELALEQAKKES 
EDQKRLKQAKELDRERAAANEQLTRAILRERI CSEEERAKAKHL 
ARQLEEKDRVLKKQDAFYKEQLARLEERSSE FYRVTTEQYQKAA 
EEVEAKFKRYESHPVCADLQAKILQCYRENTHQTLKCSALATQY 
MHCVNHAKQSMLEKGG 


6317 


102 


839 


PEAQTSAVLAREKGHLPTMRHEAPMQMASAQbARYGQXDSSDQN
FDYM FKLLI IGNS S VGKTS FLFRYADDS FTSAFVST VG IDFKVK 
TVFKNEKRI KLQI WDTAGQERYRTITTAYYRGAMGF I LMYDITN 
EES FNAVQD WS TQ I KT YSWDNAQVILVGNKCDME DERV I STERG 
QHLGEQLGFEFFETSAKDNINVKQTFERLVDIICDKMSESLETD 
PAITAAKQNTRLKETPPPPQPNCAC 


6318 


1765 


733 


PWHPLRTLPLHHPHPRPPRAEGREGADSMSHLPGLELRREAPPL 
LGPLLS P FPLPAGS WHRQMLRSSLRFP ITNS AGA PCXAAGRMN I 
LAP\ZRRDRVLAELPQCLRKEAALHGHKDFHPRVTCACQEHRTGT 
VGFKISKVIWGDLSVGKTCLINRFCKDTFDKNYKATIGVDFEM 
ERFEVLG I PFSLQLWDTAGQERFKCI ASTYYRGAQAI 1 1 VFNLN 
DVASLEHTKQWLADALKENDP5SVLLFLVGSKKDLSTPAQYALM 
EKDALQVAQEMKAEYWAVSSLTGENVRBFFFRVAALTFEANVLA 
ELEKSGARRIGDWRINSDDSNLYLTASKKKPTCCP 


6319 


88 


717 


AATMRLNQNTLLLGKKVVLVPYTSEHVPSRYkEWMKSEELQRLT 
AS E PLTLEQE YAMQCS WQEDADKCTF I VLDAE KWQAQ PGATEES 
CMVGDVNLFLTDLEDLTLGEIEVMIAEPSCRGKGLGTEAVLAML 
S YGVTTLGLTKFE AK I GQGNEPS I RMFQ KLH FEQ VATS S VFQE V 
SEQ
ID 

NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








TLRLTVSESEHQWLLEQTSHVEEKPYRDGSAEPC 


6320 


90 


1111 


RPRTGREKVAMAAVDSFYLLYREIARSOyCYMEALALVGAWYTA 
RKS I TV3 CD FYSLI R LHFI PRLGSRADL I XQ YGRWA WS GATDG 
IGKAYAEEIiASRGLNI ILISRNEEKLQVVAKDIADTYKVETDI I 
VADFSSGRE I YLPIREALKDKDVGILVNNVGVFYPYPQYPTQLS 
EDKLWDI INVNIAAASI^lVHWLPGMVERKKGAIVTISSGSCCK 
PTPQLAAFSASKAYLDHFSRALQYEYASKG2 FVQSLIPFYVATS 
MTAPSNFLHRCSWLVPSPKVYAHHAVSTLGISKRTTGYWSHSIQ 
FLFAQ YM P E W L W VWGAN I LNRSLR KEALS CTA 


6321 


1418 


341 


HRKAAU3 ALMAGRLLGKALAAVS LS LALAS VTI RS S RCRG I QAF 
RNSFSSSWFHLNTNVMSGSNGSKENSHNKARTSPYPGSKVERSQ 
VPNEKVGWIiVEWQDYKPVEYTAVSVLAGPRWADPQISESI^FSPK 
FNE KDGHVERKSKNGLYE IENGRPRNPAGRTGLVGRG&LGRWGP 
NHAADPIITRWKRDS3GNKIMHPVSGKHILQFVAIKRKDCGEWA 
IPGGMVDPGEKISATLKREFGEEALNSLQKTSAEKREIEEKLHK 
LFSQDHLVIYKGYVDDPRNTDNAWMETEAVNYHDETGBIMDNLM 
LEAGDDAGKVKWVDINDKLKLYASHSQFIKLVAEKRBAHWSEDS 
BADCHAL 


6322 


2047 


1083 


NQEILKNVESSRTVQPHFLEFLLSLGWSVDVGRHPGWTGHVST3 
WS I NCCDDGEGSQQE E VI S S EDI GAS I FNGQKKVLYYABAI/TEI 
AFWPSPVESLTDSLESNISDQDSDSNMDLMPGILKQPSLTLEL 
FPNHTDNLNSSQRLSPSSRMRKLPQGRPVPPLGPETRVSWWVB 
RYDDIENFPLSELMTEISTGVETTANSSTSLRSTTLEKEVPVIF 
IHPLNTGLFRIKIQGATGKFNMVIPLVDGMIVSRRALGFLVRQT 
VI N I CRRKRLES DS YS PPHVRRKQ KI TDI VNKYRNKQLE P EFYT 
SLFQEVGLKNCSS 


6323 


1 


656 


PASTTDGAQEAR VPLDG AF W I PRP PAGS PKGCFAC VS KP P ALQA 
P AAPAP E PS AS P PMAPTL FPMES KS S KTDSVRAAGAPPAC KHLA 
EKKTMTNPTTVI E VY PDTTE VNDY YLWS I FNFVYLN FCCLGFI A 
IiAYSIiKVRDKKLLNDLNGAVEDAKTDRLINITRSGLAAS CI MLW 
MALS VIATHRGLRSSAS IL VAEPHDWNTERPQVT FRERCPAL 


6324


1 


2061 


EGAGMRRCPCRGSLNEAEAGALPAAARMGLEAPRGGRRRQPGQQ 
RPGPGAGAPAGRPEGGGPWARTEGSSLHSEPERAGLGPAPGTES 
PQAEFWTDGQTEPAAAGLGVETERPKQKTEPDRSSLRTHLEWSW 
SELGTTCLWTETGTDGLWTDPHRS DLQ FQPEEAS PWTQ PGVHG P 
WTELETHGSQTQPERVKSWADNLWTHQNSSSLQTHPEGACPSKE 
PSADGSWKELYTDGSRTQQDIEGPWTEPYTDGSQKKQDTEAARK 
Q PGTGGFQ I QQDTDGS WTQPSTDG SQTAPGTDCL LGE PE DG P LE 
EPEPGELLTHLYSHLKCSPLCPVPRLIITPETPEPEAQPVGPPS 
RVEGGSGGFSSASSFDESEDDWAGGGGASDPEDRSGSKPWKKL 
KTVLKYS PFWS FRKHYPWVQLSGHAGNFQAGBDGRI LKRFCQC 
EQRS IiEQLM KDP LRPFVPA YYGMVLQDGQTFNQMEDLIiADFEG P 
SIMDCKMGSRTYLEEELVKARERPRPRKDMYEKMVAVDPGAPTP 
EEHAQGAVTKPRYMQWRETMSSTSTLGFRIEGIKKADGTCNTOF 
KKTQALEQVTKVLEDFVDGDHVI LQKYVACLEELREALEIS P FF 
KTHEWGSSLLFVHDHTGIAKVWMIDFGKTVALPDHQTLSHRLP 


6325 


165 


944 


GLRDPFRRKRRLKPQVKMSNYVNDMWPGSPQEKDSPSTSRSGGS 
SRLSSRSRSRSFSRSSRSHSRVSSRFSSRSRRSKSRSRSRKRHQ 
R KYRR YSRS YSRS R S RSRS RHYRERR YG FTRRY YRS PSR YRSRS 
RS RSRS RGRS YCGRAYAIARGQRYYGFGRTVYPEEHSRWRDRSR 
TRSRSRTPFRLSEKDRMELLEIAKTNAAKALGTTNIDLPASLRT 
VPS AKETS RG I G VS SNGAKPE VS I LG LSEQN FQKANCQI 


6326


238 


680 


GE PS PATQQKPS ATGAGVLHOHFSSGH I YVUMGLL P P PWTIS FT 
VQTTLQP PGGLPAAPVSGRMAFEPVGRDLARRMVPRAGKRTQTL 
GARRVAAQGARPLPEDRRPKSGERLHVTVAPCWEFVLPSVSLTA 






WO 01/53312 



PCT/US00/34263



SEQ 
ID 
NO: 


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








QAWGGVGQEASSGVP 


6327 


1 


1337 


SLARLAPAGGSVVMPTQQPAAPSTRAPKPSRSLsGSIjCALFSDA 
DSGSGMKAELPPGPGAVGREMTKEEKLQLRKEKKQQKKKRKEEK 
GAEPETGSAVSAAQCQGPTRELPESGIQLGTPREKVPAGRSKAE 
LRAERRAKQEAERALKQARKGEQGGPPPKASPSTAGETPSGVKR 
LPE YPQVDDL LLRRLVKKPERQQ VPTRKD YGS KVS L FSHLPQY S 
RQNSLTQFMSIPSSVIHPAMVRLGIiQYSQGLVRGSNAXCIALLR 
ALQQVIQDYTTPPNEELSRDLVNKLKPYMSFLTQCRPLSASMHN 
AIKFLNKE1TSVGSSKREEEAKSELRAAIDRYVQEKIVLAAQAI 
S R FAYQK I S NGD VI L VYG CS S LVSRI LQE AWTEGRR FRVWVDS 
R P WLEGRHTLRSLVHAG VPAS YLL I PAA5 YVL PE VS TEBKDS KV 
GGEKV 


6328


1030 


276 


HASAEVTTAAARGLGAMEEEMHTDAKIRAENGTGSSPRGPGCS1. 
RHFACEQNLLSR PDGS AS FLQGDTS VLAGVYGP AEVKVS KE I FN 
KATLEVIIjRPKIGLPGVAEKSRERLIRNTCEAWLGTLHPRTSI 
TWLQWS DAGS LLACCLNAACMALVDAG VPMRALFCGVACALD 
S DGTLVLDPTSKQEKE ARAVLTFALDS VERKLLMS STKGLYSDT 
ELQQCLAAAQAASQH VFRFYRES LQRRYS ICS 


6329 


3 


2016 


SSEVAAGGGTRSAMAEGSGEWTVSATGAANGLNNGAGGTSATT 
SNPLSRKLHKILETRLDNDKEMLEALKALSTFFVENSLRTRRNL 
RGDIERKSLAINEEFVSIFKEVKEELESISEDVQAMSNCCQDMT 
S RLQAAKEQTQDLI VKTTKLQSESQKLE I RAQVADAFLS KFQLT 
S DEMSLLRGTREGP I TEDFFKALGRVKQIHNDVKVLLRTNQQTA 
GLEIMEOWAIiLOETAYERLYRWAO^FCRTLTrjP^rriVQovT Tna 
MEALQDRPVLYKYTLDE FGTARRSTWRGF IDALTRGGPGGTPR 
P I EMHS HD PLR YVGDMLAWLHQATAS E KEH LE ALLKHVTTQG VE 
ENIQEWGHITEGVCRPLKVRIEQVIVAEPGAVLLYKISNLLKF 
YHHT I SG I VGNSATALLTT I EEMHLLSKK I FFNSLSLHASKLMD 
KVELPPPDIjGPSSALNQTLMLLREVLASHDSSVVPLDARQADFV 
QVLS CVIiDPLLQMCTVSASNLGTADMATFMVNS LYMMKTTLALF 
E FTDRRLEMLQFQI EAH LDTL I N E QAS YVLTR VGLS Y I YNTVQQ 
HKPEQGSIANMPNLDSVTLKAAMVQFDRYLSAPDNLLIPQLNFL 
LSATVKEQIVKQSTELVCRAYGEVYAAVMNPINEYKDPENILHR 
SPQQVQTLLS 


6330 


1151 


333 


FFYYTFYENKTFSRKMVAEKETIiSLNKCPDKMPKRTKLliAQQPI* 
PVHQPHSLVSEGFTVKAMMKNSWRGPPAAGAFKERPTKPTAFR 
KFYERGDFPIALEHDSKGNKIAWKVEIBKLDYHHYLPLFFDGLC 
EMTFP YEFFARQGIHDMLBHGGN KIIjP VLPQLI I P IKNALNLRN 
RQVICVTLKVLQHLVVSAEMVGKALVPYYRQILPVLNIFKNMNV 
NSGDGIDYSQQKRENIGDLIQETLEAFERYGGENAFINIKYWP 
TYESCLLN 


6331 


3 


495


QQGQRVRTRGRRACASATPLEGCVDLSYPRTHAALLKVAQMVTL 
LIAFICVRSSLWTNYSAYSYFEWTICDLIMILAFYLVHLFRFY 
RVLTCISWPLSELLHYLIGTLLLLIASIVAASKSYNQSGIjVAGA 
I KG FMATFLCMAS I W LS YK I S CVTQSTDAAV 


6332 


1 


878 


.Vl'g^tikFDLVSFiPtLRERIYSN^OVARQFilSWILVLgSVPDI 
NLLD YLPE I LDGLFQ I LGDNGKE I RKMCEWLGE FLKE IKKNPS 
SVKFAEMANILVIHCQTTDDLIQLTAMCWMREFIQLAGRVMLPY 
S SG I LTAVLP CLAYD DR KKS I KE VANVCNQSLM KLVTPEDDE LD 
ELRPGORQAE P TPDDAL PKQEGTASGE WTPSLHLTSCRG PRE PD 
VIGVALGPHLSNQDYFMYVTHTIVAATQRSGSSGSPPFCRQDTG 
KLSTMATHSQLVKTGTGLEPRQAVSSSH 


6333 


3 


1467 


TRTPSEAEAGGESPQSCVSAAHSDWTAGKPVSLLAPLIPPRSAG 
QPLTFSPSGRQPJdRSIjLVGMCSGSGRRRSSLSPTMRPGTGAERG 
G LWMGH PGMH YAPMGMHPMGQRANMP P V PHGMMPQMMP PMGG PP 
MGQMPGMMSSVMPGMMMSHMSQASMQPALPPGVNSMDVAAGTAS 
SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








GAKSMWTEHKSPDGRTYYYNTBTKQSTWEKPDDLKTPAEQLLSK 
ePWKEYKSDSGXPYYYNSQTKBSRWAKPKELEDLEGYQNTIVAG 

AAEAAAAWAAAAAAAAAAAAANANASTSASNTVSGTVPVVPBP 
E VTS I VATVVDNENTVT I S TE EQAQLTST PA I QDQS VEVS SNTG 

EETSKQETVADPTPKKEEEESQPAKKTYTWNTKEEAKQAFKELL 
KEKRVPSNASWEOAMKMTTMnPRVQaT.aTrr Q?vvnAmivtnmT 
EKK 


6334 


17


644 


GGNPSGRAAGFAAAAMPS S PLRVAWCS SNQNRSMEAHN1 LSKR 
GFSVRSPGTGTHVKLPGPAPDKPNVYDFKTTYDQMYNDLLRKDK 
ELYTQNGII.HMLDRNKRIKPRPBRFQNCKDLFDLILTCEERVYD 
w w viit/jji 1 * oAayA i uyf vh v vw VlJAyL/WHh»I&ATIjGAFiiICEIjOQC 
IQHTE DMENB I DELLQE FE E KSGRTFLHT VC FY 


6335


82 


529 


AARARPGVLCCRLLGAALGDQSRVEMS YI PGQP VTAWQRVE IH 
KLRQGENLILGFSIGGGIDQDPSQNPFSEDKTDKGIYVTRVSEG 
la fAo XAuijy HjDai MQWGWDMTMVTHDQARKRLTKRSEEVVRL 
LVTRQSLQKAVQQSMLS 


6336


1003 


438 


hepaskgraevgnmrlsvaaaishgrVfrrmglgpesrihllrn 

LLTGLVRHERIEAPWARVDEMRGYAEKLIDYGKLGDTNERAMRM 

ADFWLTEKDLI pklfqvlaprykdqtggytrmlqipnrsldrak 

MAV I E YKGNCLPPLPL PRRDSHLTLLNQIiLQGLRQDLRQS QEAS 
NHSSHTAQTPGI


6337


76 


524 


EG I QMI>S VQPDTKPKGCAGCNRKI KDRYLLKALDKYWHEDCLKC 
ACCDCRLGEVGSTLYTKANLILCRRDYLRLFGVTGNCAACSKLI 
PAFEMVMRAKDNVYHLDCFACQLCNQRFCVGDKFFLKNNMILCQ 


6338


66 


1349 


APNSESGIWPLPTPANLFWTRRANPDPTTSMSATDRMGPKAVP
GLRLALLLLLGLGTPKSGVQGQEGLDFPEYDGVDRVINVWAKNY 
Kl^FKKYEVLALLYHEPPEDDKASQRQFEr4EELILEliAAQVLED 
KGVG FG LVDS EKDAAVAKKLGLTE VDSMYVFKGDE VI EYDGE FS 
ADTIVEFLLDVLEDPVELIEGERELQAFENIEDEIICLIGYFKSK 
DSEH YKAFEDAAEEFHP Y I P FFATFDS KG AKKLTLKLNE I DFYE 
AFMEEPVTIPDKPNSEEEIVNFVEEHRRSTLRKLKPESMYETWE 
DDMDG I HI VAFAEE ADPDGFEFLETLKAVAQDNT ENP DLS I IWI 
DPDDFPLLVPYWEKTFDIDLSAPQIGWNVTDADRLWMEMDDEE 
DLPSAEELEDWLEDVLEGE INTEDDDDDDDD 


6339 


245 


1813 


NRCDRGGGGQAERQAGQGCRTQGAGPGFGFGHSFFSQGAMKAFH 
iluy " iJ - LJ v r v oxjrtivc UUS aVaCtUl Vc lUUWUt AEFEDVMEDS 
VTES PQRVI ITEDDEDETTVELBGQDENQEGDFEDADTQEGDTE 
SEPYDDEEFEGYEDKPDTSS SKNKDPI TI VDVPAHLQNSWES YY 
LEILMVTGLLAY I MN Y I IGKNKNSRLAQAWFNTHRELLESNFTL 
VGDDG TNKEATSTG KLNQENEH I YNLWCSGR VCCEGML I QLR FL 
KRQDLLNVLARMMR P VS DO VO T KVTMNTJT?nMryr vv it a vr-To vat 
VR LQKEMQDLS E F CSDKP KS GAKYGIiPDS LAI LS EMGE VTDGMM 
DT KMVHFLTHYAD K I ES VHFS DQ FSGP KIMQEEGQP LKLPDT KR 
TLLLTFNVPGSGNTYP KDME ALLPLMNMV I YS IDKAKKFRLNRE 
GKQKADKNRARVBENFLKLTHVQRQEAAQSRREEKKRAEKBRIM 
NEEDPEKQRRLEBAALRREQKKLEKKQMKMKQIKVKAM


6340


2 


583 


EACAHTLSCPAFARLGRARRRPWMSHRTSSTFRAERS FHSSSSS 
SSSSTSSSASRALPAQDPPMEKALSMFSDDFGSFMRPHSEPLAF 
P ARPGG AGNI KTLGDAYEFAVD VRDFS PED I IVTTSNNHIEVRA 
EKLAADGTVMNNFAHKCQLPBDVDPTSVTSALREDGS LTIRARR 
HPHTEHVQQTFRTEIKI 


6341 


2 


545


KMAVLSAPGLRGFRILGLRSSVGPAVQARGVHQSVATDGPSSTQ 
PALPKARAVA P KPS SRGE YWAKLDDL VN WARRS S LW PMTFGLA 
CCAVEMMHMAAPR YDM DR FG W FRAS PRQSD VMI VAGTIiTNKMA 
PALRKVYDQM ? EPR YWSMGSCANGGG Y YHYS YS WRGCDR I VP 
VDIYIPGCPPTAEALLYGILQLQRXIKRERRLQIWYRR 


6342 


2 


1191 


DPRVRAMLATLARVAALRKTCLFSGRGGGRGLWTGRPQSDMNNI 
KPLEGVKI LDLTRVLAGPFATMNLGDLGAEVI KVERPGAGDDTR 
TWG P P FVGTES TYYLS VNRNKKS I AVN I KDPKG VKI IKEIAAVC 
DVFVENYVPGKLSAMGLGYEDIDEIAPHIIYCSITGYGQTGPIS 
QRAG YDA VASAVSGLMHI TGPEVACLSHI AANYLIGQK EAKR WG 
TAHG S IVPYQAFKTKDGYI WGAGNNQQFAT VCKILDLP ELIDN 
S KYKTNHLRVHNRKE L I KI LSERFEEELTS KWL YLFEGSGVPYG 
P INNMKEJVFAEPQVLHNGLVMEMEHPTVGKIS VPGPAVRYSKFK 
MSEARPPPLLGQHrTHILKEVLRYDDRAIGELLSAGWDQHETH 


6343 


2 


936 


GTAMVSDEDELNLLVI WDANP I WWGKQALKES QFTLSKC IDAV 
MVLGNSHLFMNRSNKLAVIASHIQESRFLYPGKNGRLGDFFGDP 
GNPPEFNPSGSKDGKYELLTSANEVIVEEIKDLMTKSDIKGQHT 
ETLLAGSIAKALCYIHRMNKEVKDNQEMKSRILVIKAAEDSALQ 
YMNFMNVIFAAQKQNILIDACVLDSDSGLLQQACDITGGLYIjKV 
PQMPSLLQYLLWVFLPDQDQRSQLILPPPVHVDYRAACFCHRNL 
IBIGYVCSVCLSIFC^FSPICTTCErAFKISLPPVLKAKKKKLK 
VSA 


6344 


2508 


147


TMPTATI^NLRGYGMASPGLAAPSLTPPQLATPNLQQFFPQATR 
QSLLGPPPVGVPr<4NPSQFNLSGRNPQKQARTSSSTTPNRKDSSS 
QTMPVEDKSDPPEGSEEAAEPRMDTPEDQDLPPCPEDIAKEKRT 

PAPEPEPCEASELPAKRLRS see ptekeppgqlqvkaqpqarmt 

VPKQTQTPDLLPEALEAQVLPRFQPRVLQVQAQVQSQTQPRIPS 
TDTQ VQ PKLQ KQAQTQTS PEHLVLQQ KQ VQPQLQQ3AE PQKQVQ 
PQVQPQAHSQGPRQVQLQQEAEPLKQVQPQVQPQAHSQPPRQVQ 

PHTQ PQ VSLLAPEQT P WVHVCGLEM P PDAVE AGGGMEKTL PE P 
VGTQVSMEE I QNES ACGLDVGECENRAREMPGVWGAGGS LKVTI 
LQSSDSRAFSTVPLTPVPRPSDSVSSTPAATSTPSKOALQFFCY 
I CKAS CS S QQ E FQDHMSE PQHQQRLGE I QHMSQACLLSLLP VPR 
DVLETEDEEPPPRRWCNTCQLYYMGDLIQHRRTQDHKIAKQSLR 
PFCTVCTOYFKTPRKFVEHVKSQGHKDKAKELKSJUEKEIAGQDE 
DHFITVDAVGCFEGDEEEEEDDEDEEEIEVEEELCKQVRSRDIS 
REEWKGS ETYSPNTAYGVDFLVPVMGYICRI CHKFYHSNSGAQL 
SHCKSLGHFENLQKYKAAKNPSP!TTRPVSRRCAINARNALTALF 
TSSGRPPSQPNTQDKTPSKVTARPSQPPLPRRSTRLKT 


6345


2 


3483 


PRVRTKLILLVNDKKRYERVGGGPKRLGRDVEMEEM I EQLQEKV 
HBLEKQNDTLKNRL I SAKQQLQTQGYRQTPYNNVQSRINTGRRK 
ANENAGLQECPR KG I KFQDAD VAETPHPMFTKYGNS LLEEARGE 
I RNLENV I QSQRGQ I EELEHLAE I LKTQLRRKENE I E LSLLQ LR 
EQQATDQRSNIRDNVEMIKLHKQLVEKSNALSAMEGKFIQLQEK 
QRTLKISHDALMANGDELNMQLKEQRLKCCSIiEKQLHSMKFSER 
RIEELQDRINDLEKERELLKBNYDKLYDSAFSAAHEEQWKLKEQ 
QLKVQIAQLETALKSDLTDKTEILDRLKTERDQNEKIjVQENREL 
QLQYL EQKQQLDELKKR I KL YNQEND I NADELSE ALLLI KAQKE 
QICNGDLSFIiVKVDSEINKDLBRSMRELQATHAETVQELSKTRNM 
LIMQHKINIQYO^EVEAVTRKMENIiQQDYELKVEQYVHLLDIRA 
ARIHKLEAQLKDIAYGTKQYKFKPEIMPDDSVDEFDETIHLERG 
ENIjFE I H I NKVTFSS E VLQAS GD KE PVT FCT YAFYDFELQTT P V 
VRGLH PE YNFTSQ YL VHVNDL FI*Q Y IQKNTI TLEVHQAYSTE YE 
TIAACQLKFHEILEKSGRIFCTASLIGTKGDIPNFGTVEYWFRL 
RV5MDQAIRLYRERAKALGYITSNFKGPEHMQSLSQQAPBCTAQL 
SSTDSTDGNLNELHITIRCCNHLQSRASHLQPHPYWYKFFDFA 
DH0TAI IPSSNDPQFDDHMYFPVPMNMDLDRYLKSESLS FYVFD 
DSDrQENIYIGKVNVPLISLAHDRCISGIFELTDHQKHPAGTIH 
VILKWKFAYLPPSGSITTEDLGNFIRSEEPEVVQRLPPASSVST 
LVIiAPRPKPRQRLTPVDKKVSFVDIMPHQSDVSQEGSVDEVKEN 
TE KMQQG KDDVSLLS EG QLAEQS LAS S EDETE I TEDLEPEVEED 
MSASDSDDCIIPGPISKNIKQPSEKXRIEIIALSLNDSQVTMDD 
TIQRLFVECRFYSLPAEETPVSLPKPKSGQWVYYNYSNVIYVDK 
ENNKAKRDILKAILQKQEMPNRSLRFTWSDPPEDEQDLECEDI 
GVAHVDLADMFQEGRDLIEQNIDVFDARADGEGIGKLRVTVEAL 
HALQS VYKQYRDDLEA 


6346 


2321 


533 


ODRRLLRLELQKTCQPTSTMSGSHTPACGPPSALTPSIMPQElt, 
AKYTQKEESAEQPEFYYDEFGFRVYKEEGDEPGSSLLANSPLMB 
DAPQRLRWQAHLEFTHNHDVGDLTWDKIAVSLPRSEKLRSLVLA 
GI PHGMRPQLWMRLSGALQKKRNSELS YRE IVKNSSNDET I AAK 
QIEKDLLRTMPSNACFASMGSIGVPRLRRVLRALAWLYPEIGYC 
QGTGMVAACLLLFLEEEDAFWMMSAI I EDLLPAS YFSTTLLGVQ 
TDQRVLRHLIVQYLPRLDKLLQEHDIELSLITLHWFLTAFASW 
DIKLLLR1WDLFFYEGSRVLFQLTLGMLHLKEEELIQSENSASI 
FNTLSDI PSQMEDAELLLGVAMRLAGS LTDVAVETQRRKHLAYL 
IADC^QLLGAGTLTNLSQVVRRRTQRRKSTITALLFGEDDLEAL 
KAKNI KQTELVADLREAI LRVARHFQCTD PKNCS WSRQLPG LL 
PNTALTPPTPLVGLYSLWQELTPDYSMESHQRDHENYVACSRSH 
RRRAKALLDFBRHDDDELGFRKNDI ITI VSQKDEHCWVGELNGL 
RGWFPAKFVEVLDERSKEYS IAGDDSVTEGVTDLVRGTLCPALK 
ALFEHGLKKPSLLGGACHPWLFIEEAAGREVERDFASVYSRLVL 
CKTFRLDEDGKVLTPEELLYRAVQSVNVTHDAVHAQMDVKLRSL 
ICVGLNEQVLHLWLEVLCSSLPTVEKWYQPWSFLRSPGWVQIKC 
ELRVLCCFAFSLSQDWELPAKREAQQPLKEGVRDMLVKHHLFSW 
DVDG 


6347 


2921 


533 


QDRRLLRLELQKTCQPTSTMSGSHTPACGPFSALTPS IWPQE I L 
AKYTQKEESAEQPEFYYDEFGFRVYKEEGDEPGSSLLANSPLME 
DAPQRLRWQAHLE FTHNHDVGDLTWDKIAVS LPRS EKLRSLVLA 
G I PHGMRPQLWMRLSGALQKKRNSELS YREI VKNSSNDETI AAK 
Q I EKDL LRTMPS MAC FASMGS IG VPRLRR VLRALAWL Y PE IG YC 
QGTGMVAACLLLFLEEEDAFWMM5AIIEDLLPASYFSTTLLGVQ 
TDQR VLRHL I VQYL PRLDKLLQEHD I ELS L I TLHWFLTAFAS W 
DIKLLLRIWDLFFYEGSRVLFQLTLGMLHLKEEELIQSENSASI 
FNTLSDI PSQMEDAELLLGVAMRLAGS LTDVAVETQRRKHLAYL 
IADQGQLLGAGTLTNLSQWRRRTQRRKSTITALLFGEDDLEAL 
KAKNI KQTELVADLREAI LRVARHFQ CTDP KNCS WSRQLPGLL 
PNTALTPPTPLVGLYSLWQELTPDYSMESHQRDHENYVACSRSH 
RRRAKALLDFERHDDDELGFRKNDIITIVSQKDEHCWVGELNGL 
RGWFPAKFVEVLDERSKEYSIAGDDSVTEGVTDLVRGTLCPALK 
ALFEHGLKKPSLLGGACHPWLFIEEAAGREVERDFASVYSRLVL 
CKTFRLDEDGKVLTPEELLYRAVQSVNVTHDAVHAQMDVKLRSL 
ICVGLNEQVLHLWLEVLCSSLPTVEKWYQPWSFIiRSPGWVQIKC 
ELRVLCCFAFSLSQDWELPAKREAQQPLKEGVRDMLVKHHLFSW 
DVDG 


6348 


3 


3679 


^.^rtcxv^r v xuuiv^r JjAAyuWAi^jbbCxUJLli^MJjKWEL 

KLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGRDASRSLN 

EHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQKLSPENDN 

DDDEDVQVEVAEKVQKSSSPREMQKAEEKEVPEDSLEECAITCS 

NSHGPCDSNQPHKNIKITFEEDEVNSTLWDRESSHDECQDALN 

ILPVPGPTSSATNVSMWSAGPLSGEKAAINILEINEKLRPQLA 

EKKQQ FRNLKE KCFLTQLACFLANQQNK YKYE ECKDL I K FMLRN 

ERQ FKE B KLAEQLKQAEELRQY KVL VHSQE RELTQLREKLREGR 

DASRSUJEHLQ^LLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQK 

LSPENDNDDDEDVQVEVAEKVQKSSAPREMPKAEEKEVPEDSLE 
ECAITCSNSHGPYDSNQPHRKTKITFEEJDKVDSTLIGSSSHVEW 
EDAVHIIPENESDDEEEEEKGPVSPRNLQESEBEBVPQESWDEG 
YSTLS I PPEMLASYKS YSSTFHSLEEQQVCMAVDIGRHRWDQVK 
KEDHEATGPRLS RELLD EKGPBVLQDSLDRC YS TP SG CLE LTDS 
CQPYRSAFYVLEQQRVGLAVNMDEIEKYQBVEEDQDPSCPRLSR 
ELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQPYSSAVYSLEE 
QYLGLALDVDRIKKDQEEEEDQGPPCPRLSRELLEWEPEVLQD 

sldrcystpsscleqpdscqpygssfyalbekhvgfsldvgeie 
kkgkgkkrrgrrskkerrrgrkegeedqnppcprlsrelldekg 
pevlqdsldrcystpsgcleltdscqpyrsafyileqqrvglav 

DMDEIEKYQEVEEDQDPSCPRLSGELLDEKBPEVLQESIiDRCYS 
TPSGCLELTDSCQPYRSAFYILEQQRVGLAVDMDEIEKYQEVEE 
DQDPSCPRIiSRELLDBKEPEVLQDSLGRCYSTPSGYLELPDLGQ 
PYSSAVYSLEEQYLGLALDVDRIKKDQEEEEDQGPPCPRLSREL
LEWEPEVLQDSLDRCYSTPSSCLEQPDSCQPYGSSFYAIiEEKH 
VGFS LDVGB I E KKGKGKKRRGRRS KKERRRGRKEGEEDQN P PCP 
RLNSMLMEVEEPEVLQDSLDICYSTPSMYFELPDSFQHYRSVFY 
SFEEEHISFALYVDNRFFTLTVTSLHLVFQMGVIFPQ 


6349 


3 


3679


AGAEKCFVTLLACFLAKQQNKYKYEECKDLlKSMIiRNELQFKEB 

KLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGRDASRSLN 

EIILQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVOFCLSPENDN 

DDDEDVQVEVAEKVQKSSSPREMQKAEEKEVPEDSLEECAITCS 

NSHG P CDSNQPH KNI K I T FE EDE VNS TL WDRES S H DE CQDALN 

ILPVPGPTSSATNVSMWSAGPLSGEKAAINILEINEKLRPQLA 

EKKQQFRNLKEKCFLTQIACFLANQQNXYKYEECKDLIKFMLRN 

ERQFKEEKLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGR 

DASRSLNEHLQALLTPDEPDKSQGQDLQEQIiAEGCRLAQHLVQK 

LS PENDNDDDBD VQ VEVAE KVQKS S APREMPKAEE KE VPEDSLE 

ECAITCSNSHGPYDSNQPHRKTKITFEEDKVDSTLIGSSSHVEW 

EDAVH 1 1 PENES DDEEEEEKGPVS PRNLQESEEEE VPQES WDEG 

YSTLS I P PEMLAS YKS YSS T FHSLEEQQ VCMAVD I GRHR WDQVK 

KEDHEATGPRLSRELLDEKGPEVLQDSLDRCYSTPSGCLELTDS 

CQPYRSAFYVLEQQRVGLAVNMDEIEKYQEVEEDQDPSCPRLSR 

ELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQPYSSAVYSLEE 

QYLGLALDVDRIKKDQEEEEDQGPPCPRLSRBLLEWEPEVLQD 

SLDRCYSTPSSCLEQPDSCQPYGSSFYALBEKHVGFSLDVGEIE 

KKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCPRLSRELLDEKG 

PEVLQDSLDRCYSTPSGCLBLTDSCQPYRSAFYILEQQRVGLAV 

DMDEIEKYQEVEEDQDPSCPRLSGELLDEKEPEVLQESLDRCYS 

TPSGCLELTDSCQPYRSAFYILEQQRVGLAVDMDBIEKYQEVEE 

DQDPSCPRLSRELLDEKEPBVLQDSLGRCYSTPSGYLELPDLGQ 

P YSS AVYS LEEQ YLGLALD VDR I KKDQE EEEDQGP P C PRLSREL 

LEWEPEVLQDSLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKH 

VGFSLDVGEIEKKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCP 

RLNSMLMEVEEPEVLQDSLDICYSTPSMYFELPDSFQHYRSVFY 

S FEEEHIS FALYVDNRFFTLTV7SLHLVFQMGVI FPQ 


6350 


3 


3679 


KLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGRDASRSLN 
EHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQKLSPENDN 
DDD3DVQVEVAEKVQKSSSPREMQKAEEKEVPEDSLEECAITCS 
NSHGPCDSNQPHKNIKITFEEDEVNSTLWDRESSHDECQDALN 
ILPVPGPTSSATNVSMWSAGPLSGEKAAINILEINEKLRPQLA 
EKKQQ FRNL KE KC FLTQLAC F LANQQNKYKYE ECKDLI KFMLRN 
ERQ FKEE KLAE QLKQAE ELRQ Y KVLVHS QE RELTQLRE KLREGR 
DASRSLNEHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQK 
LS PENDNDDDEDVQVEVAE KVQKS SAPREMPKAEEKE VPEDSLE 
ECAITCSNSHGPYDSNQPHRKTKlTFEEDKVDSTLIGSSSHVElH 

EDAVHI IPENESDDEEEEEKGPVSPRNLQESEEBEVPQESWDEG 

YSTLS I PPEMLAS YKS YSSTFHSLEEQQVCMAVDIGRHRWDQVK 

KEDHEATGPRLSRELLDEKGPEVLQDSLDRCYSTPSGCLELTDS 

CQ P YRS AF YVLEQQR VGLAVNMDE I EKYQE VEEDQD P SCPRLS R 

ELLDEKBPEVLQDSLGRCYSTPSGYLELPDLGQPYSSAVYSLEE 

Q YLGLALDVDRI KKDQEEEEDQGPPCPRLSRELLEWEPEVLQD 

SLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKHVGFSLDVGEIE 

KKGKGKKRRGRRSKKERRRGRKBGEEDQNPPCPRLSRELLDEKG 

PEVLQDSLDRCYSTPSGCLELTDSCQPYRSAFYILEQQRVGLAV 

DMDEIEKYQEVEEDQDPSCPRLSGBLLDEKEPEVLQESLDRCYS 

TPSGCLELTDSCQPYRSAFYILEQQRVGLAVDMDEIEKYQEVEE 

DQDPSCPRLSRELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQ 

PYSSAVYSLEEQYLGLALDVDRIKKDQEEEEDQGPPCPRLSREL 

LE WE PE VLQDSLDRC YSTPS S CLEQPDS CQPYGS S F YALEEKH 

VGFSLDVGEIEKKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCP 

RLNSMLMEVEEPEVLQDSLDICYSTPSMYFELPDSFQHYRSVFY 

SFEEEHISFALYVDNRFFTLTVTSLHLVFQMGVIFPQ


6351 
6352 


1291 


319 


KhARRRTERSQLGRMLWEVANGRSLVWGAEAVQALRERLGVGGn 
RTVGALPRGPRQNSRLGLPLLLMPEEARLLAEIGAVTLVSAPRP 
DSRHHSLALTSFKRQQEESFQEQSALAAEARETRRQELLEKITE 
GQAAKKQKLEQASGASSSQEAGSSQAAKEDETSDGQASGEQEEA 
G PS SSQAGP SNGVAPL PRSALLVQLATARPR P VKAR PLDWRVQS 
KDWPHAGRPAHELRYSIYRDLWBRGFFLSAAGKFGGDFLVYPGD 
PLRFHAHYIAQCWAPEDTI PLQDLVAAGRLGTSVRKTLLLCS PO 
PDGKWYTSLQWASLQ 




23$ 


923 


WSEWLSPCHAAKCKGLSMLRITMKTRAI3LAADATEFVQGRSAP 
AMARSLVHDTVFYCLSVYQVKISPTPQLGAASSAEGHVGQGAPG 
LMGNMNPEGGVNHENGMJTRDGGMI PEGGGGNQE PRQQPQP PPEE 
PAQAAMEGPQPENMQPRTRRTKFTLLQVEELES VFRHTQY PDVP 

TRREIiAENLGVTEDKVRVWFKNKRARCRRHQRELMLANELRADP 
DDCVYIWD 


6353 






K^AGAIPli/u^P^byQAAEEEKEMDLPDSASR^CGRlLWl 
VNTDDVNAIILAQKNMLDRFEKTNEMLLNFNNLSSARLQQMSER 
FLHHTRTLVEMKRDLDSIFRRIRTLKGKLARQHPEAFSHIPEAS 
FLEEEDEDPIPPSTTTTIATSEQSTGSCDTSPDTVSPSIiSPGFE 
DLSHVQPGSPAINGRSQTDDEEMTGE


6354 


965 


510 


^SiiRPMEPTRDCPLFGGAFSAILPMGAIDVSDLRPVPDNQEVFcH 

HP VTDQS LI VELLELQAHVRGEAAARYHFEDVGG VQGARAVHVE 

SVQPLSLENLALRGRCQEAWVLSGKQQIAKENQQVAKDVTLHQA 
LLRLPQYQTDLLLTFNQPp


6356


158 


1642 


k^ssaafrgsui^gamirrvlphgmgrglltrrpgtrrggf^ldH 

WDGKVS E I KKK I KS I LPGRS CDLLQDTS HLPP EHS DWI VGGG V 
liGLSVAYWLKKLESRRGAIRVLWERDHTYSQASTGLSVGGICQ 
QFSLPENIQLSLFSASFLRNINEYLAWDAPPLDLRFNPSGYLL 
LASEKDAAAMESNVKVQRQEGAKVSLMSPDQLRNKFPWINTEGV 
ALASYGMEDEGWFDPWCLLQGLRRKVQSLGVLFCOGEVTP cvc c 

SQRMLTTDDKAWLKRIHEVHVKMDRSLEYQPVECArviNAAGA 
WS AQ IAALAGVGEG P PG TLQGTKL P VE PRKR YVYVWHCPQG PG L 
ETPLVADTSGAYFRREGLGSNYLGGRSPTEQEEPDPANLEVDHD 
FFQDKVWPHLALR VPAFETLKVQS AWAG YYD YNT FDQNG WG PH 
PL WNM Y FATGFSGHG LQQAPGI GRAVAEM VLKGR FQT I DLS P F 
LFTRFYLG E K I QENN 1 1 




354 


633 


TGLTSSCLPLOVMMTKRTKDMGKFSSVTVSTIDEEEEEIEAREvH 
ADS YAQNAKV I EKQLERKGMS KRRLQELABLEAKKAKMKGTLID 
NQFK J 


6357 


2 


915 


GLLRNMALLVR VLRNQTS I SQWVP VCSR L I P VS PTQGQGDRALS 
RTSQWPQMSQ2QACX3GSEQIPGIDIQLNRKyHTTRKLSTTKDSP 
QPVEEKVGAFTKI I EAMGFTGPLKYSKWKIKI AALRMYTS CVEK 
TDFE E F FLRCQM PDTFNS W FL I TLLHVWMCLVRMKQEGRS GKYM 
CRT TVHFMWKnvnopnox/Mrs^/TJDvrT.vtfWKsYT mtxtuciva Tr — v 

v-xvx x vnrinnDUvuUKUKVnuVWr X Xlj^ZUMWli^MTNHFyAAILGY 

DEG I LS DDHGLAAAL WRTFFNR KCEDPRHLELLVEYVRKQ IQ YL 
DSMNGEDLLLTGEVSWRPLVEKNPQSILKPHSPTYNDEGL 


6358 


2009 


1040 


ASDALHSLSAPVLRLSSRSAARPATMTEQAISFAKDFLAGGIAA 
AI SKTAVAPIERVKLLLQVQHASKQIAADKQ YKGIVDCI VRI PK 
EQGVLSFWRGNLANVIRYFPTQALNFAFKDKYKQ1 FI.GGVDKHT 
QFWRYFAGNIiASGGAAGATSIiCFVYPLDFARTRLAADVGKSGTE 
RE FRG LGD C LVK I TK SD G I RGL YQG FS VS VQG III YRAAYFGV Y 
ux*\A.\3nuifuifRM l Hi v vanflLJvj I VTAVAGVVSYPFDTVRRRMM 

MQSGRXGADIMYTGTVDCWRKIFRDEGGKAFFKGAWSNVLRGMG 
GAFVLVLYDELKKVI 


6359 


98 


1086 


v\-Kycc.nr..pnu5uu"ijfc'£>ailV kMSUSKS IQKSELLGIiLKTYNCYHE 
GKS FQLRHREEEGTL 1 1 EGLLNI AWGLRRPI RLQMQDDREQVHL 
PSTSWMPRRPSCPLKEPSPQNGNITAQGPSIQPVHKAESSTDSS 
GPLEEAEEAPQLMRTKSDASavJSQRRPKCRAPGEAQRIRRHRFS 
iwijHr iJNtiK. Jo Vr rPAYGSVTNVRVNSTMTTI^VLTLLLNKFRV 
EDG PS EFALYI VHESGERTKLKDCEYPLISR I LHGPCEKI AR I F 
LMEADLG VEVPHEVAQYI KFEMP VLDSFVEKLKEEEERE 1 1 KLT 
MKFQALRLTMLQRLEQLVEAK 


6360 


1 


345 


GTRGAVPSTLEEWLPPRSCRVFWIHSGTTMSKVSFKITLTSDP
RLPYKVIiSVPESTPFTAVIjKFAAEEFKVPAATSAIITNLXSIGIN 
PAQTAGNVFLKHGSELRI IPRDRVGSC 


6361 


615 


X58 


RPGLGQLQHCAIAPQAGNRRCRFHGRLHALTRSTHRGKPMSIMQ
FKDTLNTPL PD S S P VAVPLGAP I AVASTLS VEHNDGVETG I WAC 
APGRWRRQITSQEFCHFIQGRCTFTPDDGETLHIQAGDALMLPA 


6362 


350 


1576 


TTKDGSHSAALKLQQLPPTSSSSAVSEASFSYKENIilGALLAIF 
GHLWS IALKLQKYCHI RIiAGSKDPRAYFKTKTWWLGLFLMLLG 
EI^VFASYAFAPLSLIVPLSAVSVIASAIIGIIFIXEKWKPKDF 
LRRYVLS FVGCXJLAWGTYIiV^FAPNSHEKMTGENVTRHLVS W 
PFLL YM LVE 1 1 LFCL LL YF YXEKNANN I W I LLLVALLGS MTW 
TVKAVAGMLVLS I QGNLQLD Y P IF YVM FVCMVATAVYQAAFLSQ 
ASQMYDSSLIASVGYIIjSTTIAITAGAIFYLDPIGEDVLHICMF 
ALGCLI AFLGVFL ITRNRKKPI PFEP Y ISMDAMPGMQNMHDKGM 

VPYRVLEHTKKE 


6363 


21 


1201 


RRTRt,GS S FPRRRDSSAMES YDVIANQP VVIDNGSGVIKAGFAG 
DQIPKYCFPIWGRPKIWRVMAGALEGDIFIGPKAEEHRGLLSI 
RYPMEHGIVKDWNDMERIWQYVYSKDQLQTFSEEHPVLLTEAPL 
NPRKNRERAAE VFFETFNVPAL FI S MQ AVLSLYATGRTTGVVLD 
SGDGVTHAVPIYEGFAMPHSIMRIDIAGRDVSRFLRLYLRKEGY 
DFHSSSEFEIVKAI KERACYLS INPQKDETLETEKAQYYLPDGS 
TIEIGPSRFRAPELLFRPDLIGEESEGIHEVLVFAIQKSDMDLR 
RTLFSN I VLSGGSTL FKG FGDRLLSE VKKLAPKDVKIRI S APQE 
RLYST W I GGSI LAS LDTFKKM WVS KKE YEEDGARS IHR KTF


6364


21 


1201 


RRTRLGS S FPRRRDS SAME 3 YDVI ANQPWI DNGSGVI KAGFAG 
DQIPKYCFP1NWGRPKHVRVMAGAXBGDIFIGPKAEEHRGLLSI 
RYPMEHGIV1XDWNDMERIWQY^SK1X?LQTFSEEHPVLLTEAPL 
NPRKNRERAABVFFETFNVPALFISMQAVLSLYATGRTTGVVLD 
SGDGVTHAVPI YEG FAMPHS IMRIDIAGRDVSRFLRLYLRKEG Y 
DFHSSSEFEIVKAI KERACYLS INPQKDETLETEKAQYYLPDGS 
TISIGPSRFRAPELLFRPDLIGEESEGIHEVLVFAIQKSDMDLR 
RTLFSN IVLSGGSTL FKGFGDRLLS E VKKLAP KDVXI R I S APQE 
RLYSTW IGGS ILASLDTFKKMWVS KKEYEEDGARS IHRKTF 


6365 


234 


1939 


KHKS RAS CAARAQAFG PSRER E VHSR FRSGLRRLGESNS GCCTM 
ASMGTLAFDE YGRPFL I IKDQDRKSRLMGLEALKSHIKAAKAVA 
NTMRTSLGPNGLDKKMVDKDGDVTVTNDGATILSMMDVDHQIAK 
LM VELS KSQDDE I GDGTTGWVLAGALLEEAEQLLDRG I HP I RI 
ADG YEQAAR VAI EHLDKI SDS VLVDI KDTE PL I QTAKTTLGS KV 
VKSCHRQMAE IAVNAVLTVADMERRD VD FEL I KVEGKVGGR LED 
TKLIKGVIVDKDFSHPQMPKKVBDAKIAILTCPFEPPKPKTKHK 
LDVTS VEDYKALQKYKKEKFEEM TQQI KETGANLAIOQWGFDDE 
ANHLLLQNNLPAVRWVGGPEIELIAIATGGRIVPRFSELTAEKL 
G FAG LVQE I S FGTTKDKMLVI EQC KNS RAVT I F I RGGNKM 1 1 E B 
AKR S LHDALC V I RNL I RDNR WYGGGAAE IS CALAVSQEAD KC P 
TLEQYAMRAFADALEVI PMALSENSGMNPIQTMTE VRARQVKEM 
NPAI^3IDCLHKGTNDMKQQHVIETI*IGKKQQISLATQMVRMILK 
IDD1RKPGESEE 


6366 


257 


1898 


GNKEGAHSSTFW VLLS 1 FLGAVAMLCKEQGi^VLGLNAVFDILV 
IGKFNVLE I VQ K VLHKD KS LENLGMLRNGGLLFRMTLLTS GG AG 
MLYVRWRIMG TG P PAFTEVDNPAS FADSMLVRAVNYNYYYSLNA 
WLLLCPWWLC FDWSMGCIPL I KS I SDWRVI ALAALWFCLI GLIC 
QALCSEDGHKRR I LTLGLGFLVIPFLPASNLFFRVGFWAERVL 
YLP s vg YC VLLT FG FOALS KHTKKKKL I AAWLG I LFI NTLRC V 
LRSGEWRSEEQLFRSALSVCPLNAKVHYNIGKNLADKGNQTAAI 
RYYREAVRLNPKY VHAMNNLGNILKERNELQEAEELLS LAVQ IQ 
PDFAAAWMNLGIVQNSLKRFEAAEQSYRTAIKHRRKYPDCYYNL 
GRLYADLNRHVDALNAWRNAWLKPEHSLAWNNMI I LLDNTGNL 
AQAEAVGREALELIPNDHSLMFSLANVLGKSQKYKESEALFLKA 
IKANPNAASYHGNLAVLYHRWGHLDLAKKHYEISLQLDPTASGT 
KENYGLLRRKLELMQKKAV 


6367 


287 


1934 


SIGFPVMLVLSILLYTCEMFQDSVAFEDVAVSFTQEEWALLDPS 
QKNLYRDVMQETFKNLTSVGKTWKVQNIEDEYKNPRRNLSLMRE 
KLCESKESHHCGESFNQIADDMLNRKTLPGITPCESSVCGEVGT 
GHSSLNTHI RADTGHKSS EYQE YGENPYRNKECKKAFS YLDSFQ 
SHDKACTKEKPYDGKECTETFISHSCIQRHRVMHSGDGPYKCKF 
CGKAF Y FLNLCL I HERIHTG VKPYKC KQCGXAFTR S TTLPVHER 
THTGVNADECKECGNAFSFPSEIRRHKRSHTGEKPYECKQCGKV 
PISFSSIQYHKMTHTGEKPYECKQCGKAFRCGSHLQKHGRTHTG 
EKPYECRQCGKAFRCTSDLQRHEKTKTEDKPYGCKQCGKGFRCA 
SQLQIHERTHSGEKPHECKECGKVFKYFSSLRIHERTHTGEKPH 
E CKQCGKAFR YFS SLH I HERTHTGDKP YE CK VCG KAFTCS SS I R 
YHERTHTGEKPYECKHCGKAFISNYIRYHERTHTGEKPYQCKQC 
GKAFIRASSCREHERTHTINR 


6368 


1 


327 


RPVPAKLN P RS WPRTAGALPLRP PPLTMAVFHDE VE I EDFQ YDE
DS STY FY P CP OGDNFS I TKEDLENGEDVATCPSCS L 1 1 KVI YDK 
DQFVCGETVPAPSANKELVKC 


6369 


1 


1745 


AliUUKUTKlf'F TfKUi'yb LAJHNf'UKS AAUT VTRTIHGS PREDTGT 

prsremmfqdsvafedvavsftqebwalldpsqknlyrdvmqet 
fknltsvgk™cvqniedeyknprrnlslmreklcbskeshhcg 
es fn q i add m lnr ktl pgzt pces s vcgevgtghss lnthi rad 
tghksseyqeygenpyrnkeckkafsyldsfqshdkactkekpy 

DGKECTETFISHSCIQRHRVMHSGDGPYKCKFCGKAFYFLNLCL 
IHERIHTGVKPYKCKQCX5KAFTRSTTLPVHERTHTGVNADECKE 
CGNAFSFPSEIRRHKRSHTGEKPYECKQCGKVFISFSSIQYHKM 
THTGEKPYECKQCGKAFRCGSHLQKHGRTHTGEKPYECRQCGKA 
FRCTSDLQRHEKTHTEDKPYGCKQCGKGFRCASQLQIHERTHSG 
EKPHECKECGKVFKYFSSLRIHERTHTGEKPHECKQCGKAFRYF 
SSLHIHERTHTGDKPYECKVCaKAPTCSSSifeYH^feTHTGBKPY 
E CKHCG KA F I S N YI RYHE RTHTGE KP YQCKQCGKAFI RAS S CR E 
HERTHTINR 


6370 


1711 


329 


s v rKo lAjiJUftvjAAAAQjARTAG AGLLRLLLGCG 
ALVGGLRPVThFTTPANAQNASKTWELSLYELHRTPQEArMDGTE 

I AVS PRSLHSELMCP ICLDKIiKNTMTTKECLHRFCSDCI VTALR 

SGNXK(~!PTCJ? VKT.VQ W CT.PDnDMPnaT.Te^rvDPDDPvr* unn 
•jwiiiuj^r i ^-jmvxvu voiUvoljKJrJJrIVrL//iijXisK.l i foKKcil: &AHQD 

RVLI RLSRLHNQQALSSS IEEGLRMQAMHRAQRVRR PI PGSDQT 
TTM S GGEG E PG EG EGDGE DVS 5 DS APDS APG PAP KR PRGGGAGG 
SSVGTGGGGTGGVGGGAGSEDSGDRGGTLGGGTLGPPSPPGAPS 
PPEPGGEIELVFRPHPLLVEKGEYCQTRYVKTTGNATVDHLSKY 
! LALRIALERRQQQEAGEPGGPGGGASDTGGPDGCGGEGGGAGGG 
DGPE2PALPSLEGVSEKQYTIYIAPGGGAFTTLNGSLTLELVNE 
KFWKVSRPLELCYAPTKDPK 


6371 


3 


268 


GVANMSTAMNFGTKSFQPRPPDKGSFPLDKLGECKSFKEK^MKC 
LHNNNFENALCRKESKEYLECRMERKLMLQEPLEKLGFGDLTSG 
KSEAXK 


6372


2141 


625 


RVSAIASEGKAEERYKKLEDLI^KSFSLVKMPSLQPVVMCVMKH™ 
u f K V P fc. KKIiKIjVMAD KE LYRACAVE VRRQ I WQDNQALFGDEVS P 
LLKQYILEKESALFSTELSVLHNFFSPSPKTRRQGEWQRLTRM 
VGKNVKLYDMVLQFLRTLFLRTRNVHYCTLRAELLMSLHDLDVG 
EI CTVDPCH KFTWCLDAC I RER F VDS KRARB LQG FLDGVXKGQE 
QVLGDLSMILCDPFAINTLALSTVRHLQELVGQETLPRDSPDLL 
uij jjxtuij^tutayiji.ft.wL'K l ui* U V fc Kb P KM EVEL I TRFLPMLMSFLVD 
DYTFNVDQKLPAEE KAPVS YPNTLPES FTKFLQEQRMACEVGLY 
YVLHITKQRNKNALLRLLPGliVETFGDLAFGDI FLHLLTGNtiAL 
LADEFALEDFCS SLFDGFFLTAS PRKENVHRHALRLLIHLHPRV 
APSKLEALQKALEPTGQSGEAVKELYSQU3EKLEQLDHRKPSPA 
QAAETPALELPLPSVPAPAPL 


6373 


67 


711 


PSRAARAS PARLPAMVS W IIS RLWLI FGTLYPAYY2YKAVKS K 
DI KEYVKWMMYWI I FALFTTAETFTDI FLCWFPFYYELKIAFVA 
WLLSPYTKGSSLLYRKFVHPTLSSKEKEIDDCIiVQAKDRSYDAL 
VHFGKRGLNVAATAAVMAASKGQGALSERLRSFSMQDLTTIRGD 
GAPAPSGPPPPGSGRASGKHGQPKMSRSASESASSSGTA 


6374 


535 


2105


HKLFCSYISTSEFPSSTRHHSCPTHTFCNYTSSTIFLSSTRDHS 
*~ir lax s\mi iSblltljbolKUHSuPTHTSCNYTSSTIFLSSTRD 
HS CPTHTSCNYTSS TI FLS S TRDHS CPTHTFCNYPR P I IRLS S C 
CPAELQTEGSNGKKEVLSGFQWLEDTVLPPEGGGQPDDRGTIN 
DISVLRVTRRGEQADHFTQTPLDPGSQVLVRVDWERRFDHMQQH 
SGQHLITAVADHLFKLKTTS WELGRFRSAI ELDTPSMTAEQVAA 
IEQSVNEKIRDRLPVNVREIjSLDDPEVEQVSGRGLPDDHAGPIR 
WNIEGVDSNMCCGTHVSNLSDLQVIKILGTEKGKKNRTNLIFL 
SGNRVLKWMERSHGTEKALTALLKCGAEDHVEAVKKIiQNSTKIL 
QKNNLNLLRDLAVH I AHSLRNS PDWGGWI LHRKEGDSEFMN 1 1 
ANEIGS EETLLFLTVGDEKGGGLFLIiAGPPAS VBTLGPRVAEVL 
EGKGAGKKGRFQGKATKMSRRMEAQALLQDYISTQSAKS 


6375 


1 


1535 


AIMAAATRPVRLPEAGCEGRERCWNPSRSRSHSGEGGLAAWSRT 
CPGRPRRPGQQWRGPTMLVTAYLAFVGLLASCLGLELSRCRAK 
PPGRACSNPSFLRFGLDFYQVYFIoALAADWLQAPYIiYKLYQHYY 
FLEGQIAILYVCGLASTVLPGLV ASS LVDWLGRKNSCVLFSLTY 
SLCCLTKLSQDYFVLLVGRALGGI/STALLFSAFEAWYIHEHVBR 
HDFPAEWI PATFARAAFWNHVLAWAGVAAEAVAS WIGLGPVAP 
FVAAI PLLALAGALALRNWGEN YDRQRAFSRTCAGGLRCLLS DR 
R VLLLGT I QALFES V I FI FVFLWT P VLD PHGAPLGI I FS S FMAA 
SLLGSSLYRIATSKRYHLQPMHLLSLAVLIWFSLFMLTFSTSP 
GQES P VES F r AFLL I ELACGL YFPSMS FLRRKVI PETEQ AGVLN 
W FRVPLHS LACLGLLVLHDS DRKTGTRNM FS I CS AVM VMALLAV 
VGLFTWRHDAELRVPSPTEEPYAPEL 


6376 


380 


1437 


isstdidhyrfsflvnskmpskeSWsgrktnraavhkskqegrq 

QDLLIAALGMKLGSPKSSVTIWQPLKLFAYSQLTSLVRRATLKB 
NEQIPKYEKIHNPKVHTFRGPHWCEYCANFMWGLIAQGVKCADC 
GLNVHKQCSKMVPNDCKPDLKHVKKVYSCDLTTLVKAHTTKRPM 
WDMCIREIESRGLNSEGLYRVSGFSDLIEDVKMAFDRDGBKAD 
IS VNM YEDINI I TGALKLYFRDLPI PL I T YDAYPKFI ESAKIMD 
PDEQLETLHEALKLLPPAHCETLRYLMAHLKRVTLHEKENLMNA 
ENLGIVFGPTLMRSPELDAMAALNDIRYQRLWELLIKNEDILF 


6377 


2311 


1845 


SRIRRRSSRRPREPPGPSRRRRRRRPOPRTMPSEKTFKQRRTFE
QRVE DVRLIREQHPTKI PVI I ERYKGE KQLPVLDKTKFLVPDHV 
NMS E L I K 1 1 RRRLQLNANQAF FLLVNGHSMVS VSTP I S EVYESB 
KDEDGFLYMVYASQETFGMKLSV 


6378 


606 


191 


GAGPWEAFPDGIGRRSRRARLPQYKRPPGRVGGGDSGRRNMAV'A 
DLALIPDVDIDSDGVFKYVLIRVHSAPRSGAPAAESKEIVRGYK 
WAEYHADIYDKVSGDMQKQGCDCECLGGGRISHQSQDKKIHVYG 
YSMAYGPAQHAISTEKI KAKYPDYEVTWANDGY 


6379 


35 


378 


KRAG S PS PS RAALR R CAPQRSQAPRWPDRAACRRS FQGS QGRAY 
LFNSWNVGCGPAEERVLLTGLHAVADIYCENCKTTLGWKYEHA 
FESSQKYKEGKYIIELAHMIKDNGWD 


6380 


1414 


462 


PAVQGQRGAGPP*iXJRGSGNMARFALT\ATRHGETliFKKEKIIQGOr" 
GVDE PLSETGFKQAAAAGIFLNlfVKFTHAFSSDLMRTKQTMHGI 
LERSKFCKDMTVKYDSRLRERKYGVVEGKALSELRAMAKAAREE 
CPVFTP PGGE TLDQVKMRG ID FFE FLCQL I LKEADQKEQ FSQGS 
PSNCLETSLAE I FPLGKNHSSKVNSDSGI PGLAAS VLWSHGAY 
MRS L FD YFLTDLKCSLPATLS RS ELMS VTPNTGMSL FI I NFEEG 
REVKPTVQCICMNLQDHLNGLTENSLGLNLPSKSNHFEPLKGVP 
LALFTSLLC 


6381 


1668 


218 


AWRAQGSRGFSGAGWRPRQAAAMNFSEVFkLSSLLCKFSPDGK 
YLAS C VQYRL WRD VNTLQ I LQLYTCLDQ IQH I E WS ADSLF I LC 
AMYKRGLVQVWSLEQPEWHCKIDEGSAGLVASCWSPDGRHILNT 
TE FHLR ITVWS LCTKS VS Y I K YPKACLQGITFTRDGRYMALAE R 
RDCKD YVS I PVCSDWQLLRHFDTDTQDLTX3IEWAPNGCVLAVWD 
TCLEYKILLYSLDGRLLSTYSAYEWSLGIKSVAWSPSSQFLAVG 
S YDGKVR I LNH VTW KM I TE FGHPAAI ND P KI WYKEAEKS PQLG 
LGCLS FPPPRAGAG PL PSSESKYEIAS VP VSLQTLKPVTDRANP 
KIGIGMLAFSPDSYFLATRNDNIPNAVWVWDIQKLRLFAVLEQL 
S PVRAFQWDPQQ PRLAICTGGSRL YLWS PAGCMS VQVPGEGDFA 
VLSLCWHLSGDS MALLS KDHFCLCFLETEAWGTACRQLGGHT 


6382 


2 


1062 


FEEDEDRNLCLIAYPLKGDHGIVDIVDNSDCEPKSKLLRWTTNK 
KHHVLETEKTPKDWVRQHRKEEKMKSHXLEEEFEWLKKSEVLYY 
TVEKKGN ISSQLKHYNPWSMKCHQQQLQRMKENAKHRNQ YKFI L 
LENLTSRYEVPCVLDLKMGTRQHGDDASEEKAANQIRKCQQSTS 
AVIGVRVCGMQVYQAGSGQLMFMNKYHGRKLSVQGFKEALFQFF 

HNGRYLRRELLGPVLKKLTELKAVLERQESYRFYSSSLLVIYDG 
KERPEWLDSDAEDLEDL^FV Qanpc an a wo t r« a o ot rr-ktm m 

I DFAHTTCRL YGEDTWHEGQDAG YI FGLQS LID I VTE I SEE SG 
E 


6383 


3159 


1061 


5 PAPGRPS PHtiSQPAARAAAAPAMPSAKQRGSKGGHGAAS PSEK 
GAHPSAARPLAAPTPAAPACRS PS PGGAPAS FPGRAPRS LASQP 
AARAAAAPAMPSAKQRGSKGGHGAAS PSEKGAHPSGGADDVAKK 
PPPAPQQPPPPPAPHPQQHPQQHPQNQAHGKGGHRGGGGGGGKS 
SSSSSASAAAAAAAASSSASCSRRLGRALNFLFYLALVAAAAFS 
GWamHVLEEVQQVRRSHQDFSRQREELGQGLOGVEQKVQSLQA 
T FGTFES I LRS SQHKQDLTE KAVKQGES E VS R I S E VLQKLQN E I 
LKDLSDGIHWKDARERDFTSliENTVEERLTELTKSINDNIAIF 
TEVQKRSQKEINDMKAKVASLEESEGNXQDLKALKEAVKEIQTS 
AKSREWDMEALRSTLQTMESDI YTEVRELVS LKQBQQA. FKE AAD 
TERLALQALTEKLLRSEESVSRLPEEIRRLEEELRQIiKSDSHGP 
KEDGGFRHSEAFEALQQKSQGLDSRLQHVEDGVLSMQVASARQT 
ESLES LLS KSQEHEQRLAALQGRLEGLGS S E ADQDGLASTVRS L 
GETQLVLYGDVEELKRSVGELPSTVESLQKVQEQVHTLLSQDQA 
QAARLPPQDFLDRLSSLDNLKASVSQVEADLKMLRTAVDSliVAY 
SVKIETNENNLESAKGIiLDDLRNDLDRLFVKVEKIHEKV 


6384 


738 


1904 


I WEVPVCLTHLLHLQQANQPLP P PSSS INEEDADEANRAIGEKR 
AAPDSGKKPKTPKTKQQKDPNEPQKPVSAYALFFRDTQAAIKGQ 
NPNATFGEVSQIVASMWDSLGEEQKQVYKRKTEAAKKEYLKALA 
AYRASLVSKAAAESAEAQTIRSVQQTLASTNLTSSLLLNTPLSQ 
HGT7SASPQTLQQSI.PRSIAPKPLTMRLPMNQIVTSVTIAANMP 
SN I GAP L I S S MGTTMVGS APSTQ VS PS VQTQQHQMQLQQQQQQQ 
QQQMQQMQQQQLQQHQMHGQIQQQMQQQHFQHHMQQHLQQQQQH 
LQQQINQQQLQQQLQQRIjQLQQLQHMQHQSQPSPRQHSPVASQI 
TSPIPAIGSPQPASQQHQSQIQSQTQTQVLSQVSIF 


6385 


2 


1584 


PRVRAADVAAGAQAWSAGMAKSNGENGPRAPAAGESLSGTRES 
LAQGPDAATTDELS SLGSDSEANGFAERRI DKFG FI VGS QGAEG 
ALEEVPLEVLRQRESKWLDMLNNWDKWMAKKHKKIRLRCQKCilP 

pslrgrawqylsggkvjojqqnpgkfdeldmspgdpkwldvierd 
lhrqfpfhemfvsrgghgqqdlfrvlkaytlyrpeegycqaqap 
iaavllmhmpaeqafwclvqicekylpgyysekleaiqldgeil 
fsllqkvspvahkhlsrqkidpllymtewfmcafsrtlpwssvl 
rvwdmffcegvki ifrvglvllkhalgs pekvkacqgqyetier 
lrs ls pk i mqeaflvqe we lpvterqierehiii qlrrwqet rg 
elqcrspprlhgakaildaepgprpalqpsps i r1»pldaplpgs 

KAKP KP PKQAQKE QRKQMKGRGQLEKPPAPNQAM VVAAAGDACP 
PQHVPPKDSAPKDSAPQDLAPQVSAHHRSQESLTSQESEDTYL 


6386 


819 


195 


TVCGSFYLGIMQRASRLKRELHMLATEPPPGITCWQDKDQMDDI* " 
RAQILGGANTPYEKGVFKLEVIIPERYPFEPPQIRFLTPIYHPN 
IDSAGRI CLDVLKLPPKGAWRPSLNIATVLTS IQLLMSEPNPDD 
PLMADISSEFKYNKPAFLKNARQWTEKHARQKQKADEEEMLDNL 
PEAGDSRVHNS TQKRKAS QLVGI EKKFH PD V 


6387 


1 


662 


PGPTHASADAWADAWAQPNMAMHNKAAPPQIPDTRRELAELVKR
KQELAETLANLERQIYAFEGSYLEDTQMYGNIIRGWDRYLTNQK
NSNSKNDRRNRKFKEAERLFSKSSVTSAAAVSALAGVQDQLIEK
REPGSGTESDTSPDFHNQENEPSQEDPEDLDGSVQGVKPQKAAS
STSSGSHHSSHKKRKNKNRHSPSGMFDYDFEIDLKLNKKPRADY


6388


1 


662 


PGPTHASADAWADAWAQPNMAMHNKAAPPQIPDTRRELAELVKR
KQELAETLANLERQIYAFEGSYLEDTQMYGNIIRGWDRYLTNQK
NSNSKNDRRNRKFKEAERLFSKSSVTSAAAVSALAGVQDQLIEK
REPGSGTESDTSPDFHNQENEPSQEDPEDLDGSVQGVKPQKAAS
STSSGSHHSSHKKRKNKNRHSPSGMFDYDFEIDLKLNKKPRADY


6389 


1074 


497 


AEPGDRMaGHRLVLVLGDLH I PHRCNSLPAKFKKIiL VPGKIQH I 
JjIIIjJUA. rKESYDYLjKTI^DVHIVRGDFDENLNYPEQKVVTVG 
QFKIGLIHGHQVIPWGDMASLALLQRQFDVDILISGHTHKFEAF 
EHENKFY1NPGSATGAYNALETNI IPS FVLMDIQASTWTY VYQ 
LIGDDVKVERIEYKKP 


6390


158 


535 


GEERKEGRAPGKAFAPERNPAKMEKEETTREXiLLPNWQGSGSHG 
LTI AQRDDGVFVQEVTQNS PAARTGWKEGDQIVGATI YFDNLQ 
SGEVTQLLNTMGHHTVGLKLHRKGDRFFPSLGQTWDP 


6391 


5386


2897 


VRWNSKTECYLSIQTQENFPANLNEIjVNCIVISSLVTTQRKH<A 
MS LLGS RNQLARAVLNPNPMDFC TKDLLTTTS ER I IAYLRDFNE 
DQKKAI ETAYAMVKH5 PSVAKICLIHGPPGTGKSKTI VGLLYRL 
LTENQRKGHSDENSNAKI KQNRVLVCAPSNAAVDE LMKKI ILE F 
KE KCKD KKNPLGNCGDINL VRK3 PEKS I NSE VLK FS I>DSQ VNH R 
MKKELPSHVQAMHKRKEFLDYQLDELSRQRALCRGGREIQRQEL 
DENISKVSKERQELASKIKEVQGRPQKTQSIIILESHIICCTLS 
TSGGLLLESAFRGQGGVPFSCVIVDEAGQSCEIETLTPLIHRCN 
KLILVGDPKQLPPTVISMKAQEYGYDQSMMARFCRLLEENVEHN 
M I S RLP I LQLTVQ Y RMHPD IC LFPSN YV YNRNLKTNRQTEA I RC 
SSDWPFQPYLVFDVGDGSERRDNDSYINVQEIKXVMEIIKLIKD 
KR KDVS FRN IGI I TH Y KAQKTMI Q KDLDKE FDRKG P AE VDTVD A 
FQGRQKDCVIVTCVRANSIQGSIGFLASLQRLNVTITRAKYSLF 
ILGHLRTLMENQHWNQLIQDAQKRGA I I KTCDKNYRHDAVKI LK 
LKPVLQRSLTHPPTIAPEGSRPQGGLPSSKLDSGFAKTSVAASL 
YHTPSDS KE I TLTVTS KDPERP P VHDQLQD P RLLKRMG I E VKGG 

I FLWDPQPSS PQHPGATPPTGEPGFPWHQDLSHVQQPAAWAA 
LSSHKPPVRGBPPAASPEASTCQSKCDDPEEELCHRREARAFSE 
GBQEKCG S ETHHTRRNSRWDKRTL EQEDSSS KKRKLL 


6392 
6393


972 


186 


GRTG VDLAS SMAHRLQ I RLLT WD VKDTLLJRLRHPLG EAYATKAR 
AHGbEVEPSALEQGFRQAYRAQSHSFPNYGLSHGLTSRQWWLDV 
VLQTFHbAGVQDAQAVAPIAEQLYKDFSHPCTWQVLDGAEDTLR 
ECRTRGLRLAVISKFDRRLEGILGCLGLREHFDFVLTSEAAGWP 
KPDPRIFQEALRLAHMEPWAAHVGDNYLCDYQGPRAVGMHSFL 
WGPQALDPWRDSVPKEHILPSLAHLLPALDCLEGSTPGL 




2017


730 


TGGSKMAAVATCGSVAASTGSAVATASKSNVTS FQRRGPRAS VT 
NDSGPRLVSIAGTRPSVRNGQLLVSTGLPALDQLLGGGLAVGTV 
LLIEEDKYNIYSPLLFKYFLAEGIVNGHTLLVASAKEDPANILQ 
ELPAPLLDDKCJCKEFDEDVYNHKTPESNIKMKIAWRYQLLPKME 
IGP VS S SRFGH Y YDASKRM PQELI EASNWHG FFLPEKI S3 TLKV 

EPCSLTPGYTKLLQFIQNIIYEEGFDGSNPQKKQRNILRIGIQN 
LGSPLWGDDICCAENGGNSHSLTKFLYVLRGIiLRTSLSACIITK 
PTHLIQNKAI IARVTTLSDVWGLES FIGS ERE TNPLYKDYHGI* 
IHIRQiPRLNNLrCDESDVKDLAFKLKRKLFTIERLHLPPDLSD 
TVSRSSKMDLAESAKRLGPGCGMMAGGKKHLDF 


6394 


1418 


511 


gaaaggegarrrpaamatvMaA¥aaera\^^eFrwllhdevha 

VLKQLQD I LKEASLRFTL PGSGTEGPAKQEN FI LGS CGTD Q VKG 

vltlqgdalsqadvnlkmprnnqllhfafredkqwklqqiqdar 

NHVSQAIYLLTSRDQSYQFKTGAEVLKIiMDAVMLQLTRARNRLT 
TPATLTLPEIAASGLTRMFAPAIiPSDLLVNVYINLNKLCLTVYQ 

lhalqpnstknfrpaggavlhspgamfewgsqrlevshvhkvec 
vipwlndalvyftvsi,qlcqqlkdkisvfssywsyrpf 


6395 
6396


13 


658 


PSGRPTRPLCCAARRGAARHGGSVSGWPAGRTPTETSNPGSSVM 
ESVTFEDVAVEFIQEWALLDSARRSLCKYRMLDQCRTLASRGTP 
PCKPSCVSQLGQRAEPKATERGILRATGVAWESQLKPEELPSMQ 
DLLEEASSRDMQMGPGLFLRMQLVPSIEERETPLTREDRPALQE 
PPWS LG CTGLKAAMQ I QR WI PVPTLGHRNPW VARDSGE 




1 


1951 


ANILSS PSKRGQKGTLIGYS PEGTPLYNFMGDAFQHSSQS I PRF 
IKESLKQILEESDSRQIFYFLCLNLLFTFVELFYGVLTNSLGLI 
SDG FHMLFDCSALVMGLFAALMSRWKATRI FS YG YGRI EILSGP 
INGLFLIVIAFFVFMESVARLIDPPELDTHMLTPVSVGGLIVNL 
IGICAFSHAKSHAHGASQGSCHSSDHSHSHHMHGHSDHGHGHSH 
GSAGGGMNANMRGVFLHVLADTLGSIGVIVSTVIilEQFGWFIAD 
PLCSIjFIArLIFLSWPLIKDACQVIiLLRLPPEYEKELHIALEiC 
IQKIEGLISYRDPHFWRHSASIVAGTIHIQVTSDVLEQRIVQQV 
TG I L KDAG VNNLT I QVEKEAYFQHMSGLS TGFHDVLAMTKQME S 
MKYCKDGTYIM 


6397 


391 


122 


GAGGVGRFEAIRAPARMIEWCNDRLGKKVRVKCNTDDTIGDLK
Kh I AAQTGTRWNK I VLKKW YT I FKDHVS LGD YE I HDGMNLELY Y 


6398


353 


1306 


HKQMGPLINRCKKILLPTTVPPATMRIWLLGGLLPFtLLLSGLQ 
RPTEG S E VA I K I D PDFA PGS FDDQ YQG CSKQVMEKLTQGD YFTK 
DIEAQKNYPRMWQKAHLAWLNQGK\^PQNMTTTHAVAILFYTLN 
SNVHSDFTRAMAS VARTPQQYERS FHFKYLHYYLTS AI Q/LLRKD 
SIMENGTLCYEVHYRTKDVHFNAYTGATIRPGQFLSTSIiLKEEA 
QEFGNQTLFTIFTCLGAPVQYFSLKKEVLIPPYELFKVINMSYH 
PRGDWLQLRSTGNLSTYNCQLLKASSKKCIPDP IAIASLS FLTS 
V1IFSK5RV | 


6399


75 


1245 


PNLETYFGRRGEKDSMNFTPTHTPVCRKRTVVSKRGVAVSG^TK I 
RRGMAD5 LESTPLPS PEDRLAKLHPS KELLE Y YQKKMAECE AEN 
EDLLKKLELYKEACEGQHKLECDLQQREEEIAELQKALSDMQVC 
LFQEREHVLRLYSENDRLRIRELEDKKKIQNLLALVGTDAGEVT 
YFCKEPPHKVTILQKTIQAVGECEQSESSAFKADPKISKRRPSR 
ERKESSEHYCRDIQTLILQVEALQAQLGEQTKLSREQIEGLIED 
RRIHLEEIQVQHQRNQNKIKELTKNLHHTQELLYE5TKDFLQLR 
SENQNKEKSWMLEKDNLMSKI KQYRVQCKKKEDKIGKVLPVMHE 
SHHAQSEYIKVMSLCRNEWYFSGRVEGIPKNLQFVM


6400 


2520 


1053 


KTMKCDEWYKVQSAII^HNCXSYAMKTGKFFHNLMERKDFEfwtH 
DNISVTFLSLTDLQKNETLDIILISLSGAVQLRHLSNNLETLLKR 
DFLKLLPLBLSFYLLKWLDPQTLLTCCLVSKQWNKVISACTEVW 
QTACKNLGWQIDDSVQDALHWKKVYLKAILRMKQLEDHEAFETS 
SLIGHS ARVYALYYKDGLLCTGSDDLSAKLWDVSTGQC VYGI QT 
HTCAAVKFDEQKLVTGSFDNTVACWEWSSGARTQHFRGHTGAVF 
SVDYNDELDILVSGSADFTVKVWALSAGTCLNTLTGHTEWVTKV 
VLQKCKVKS LLHS PGD YI LL S AD K YE I KI WP IGRE I N CKCLKTL 
S VSEDRS I CLQPRLHFDGKY I VCS S ALGL YQ WD FAS YD ILRV I K 
TPEI ANLALLG FGD I FALLFDNR YLY I MDLRTESL I SRWPLPE Y 
RKSKRGSS FIiAGEASWLNGLDGHNDTGLVFATSMPDHS IHLVLW 


6401 


109 


765 


pgaawsrpdlrgcctgpqpalrmlvlpspcpOpiafssvetMegj 

PPRRTCRSPEPGPSSSIGSPQASSPPRPNHYLLIDTQGVPYTVL 

vdeesqr e pgasg apgq kkc ys CP vcsrvfe yms ylqrhs iths 
evkpfecdicgkafkrashlarhhsihlagggrphgcplcprrf 
rdagelaqhsrvhsgerpfqcphcprrfmeqntlqkhtrwkhp 1 


6402 
6403


1196 


279 


ttso^ggirqssaipvasmefaaiclrnallllpeeqqdpkqenH 

gaknsnqlggntessessetgsskshdgdkfipappssplrkqe 

lenlkcsiiacsay^aialgdnlmalnhadkllqqpklsgslkf 

lghlyaaealisldrisdaithlnpenvtdvslgissneqdqgs 

dkgeweamessgkrapo^ypssvnsartvmlfnlgsayclrsey 

dkarkclhqaasmihpkevppeaillavylelqngntolalqii 

krnqllpavkthsevrkkpvfqpvhpiqpiqmpafttvqrk 


6404 


2 

1012 


1^90 
222 


RG I HTS VLQGN LQNQM YSHNW IMNLNNLNLTQ VQQRNL I TNLQ 
RSVDDTSQAIQRIKNDFQNLCXJVFLQAKKDTDWLKEKVQSLQTL 
AANNSALAKAJNNDTLEDMNSQLNSFTGQMENITTISQANEQNLK 
DLQDLHKDAENRTA I KFNQLEER FQLFETD IVNIISNIS YTAHH 

LRTLTSNLNEVRTTCTDTLTKHTDDLTSLNNTLANIRLDSVSLR 
MQQDLMR S RLDTE VANLS VI MEEMKLVDS KHG QLI KN FT I LQGP 

PGPRGPRGDRGSQGPPGPTGNKGQKGEKGEPGPPGPAGERGPIG 
PAGPPGERGGKGSKGSQGPKGSRGSPGKPGPQGPSGDPGPPGPP 
GKEGLPGPQGPPGFQGLQGTVGEPGVPGPRGLPGLPGVPGMPGP 
KGPPGPPGPSGAWPLALQNEPTPAPEDNSCPPffi^KNFTDKCYY 
FSVEKEIFEDAKLFCEDKSSHLVFINTREEQQWIKKQMVGRESH 
WIGLTDSERENEWKWLDGTSPDYKNWKAGQPDNWGHGHGPGEDC 
AGLIYAGQWNDFQCEDVNNFICEKDRETVLSSAL 1 
ftAALAMAAPAPGLISVFSSSQELGAAIiAQLVAQRAACCLAGARA | 
RFALGI^GGSLVSMLAREXPAAVAPAGPASLARWTLGFCDERIiV 
PFDHAESTYGLYRTHLLSRLPIPESQVITINPELPVEEAAEDYA 
K KLRQAFQGDS I P VFDL LI LG VG P DGHTCS L FPDHPLLQERE K I 
VAPISDSPKPPPQRVTLTLPVLNAARTVIFVATGBGKAAVLKRI 
LEDQE EN P LPAALVQ PHTG KLCWFLDEAAARLLTVP FEKHS PL 


6405 


1 


1456 


AALPRPTPRAPLGREGTGSDSEMAASMFYGRLVAVATLRNHRPR
TAQRAAAQVLGSSGLFNNHGLQVQQQQQRNLSLHEYMSMELLQE 
AGVS VP KG YVAKS PDEAYAIAKKLGSKD WIXAQVLAGGRGKGT 
FESGLKGGVKIVFSPEEAKAVSSQMIGKKLFTKQTGEKGRICNQ 
VLVCE R K YPRR EY Y FAI TMERS FQGPVLI GS S HGG VN I ED VAAE 
TPEAIIKEPIDIEEGI KKEQALQLAQKMG F PPNI VE S AAENM VK 
LYSLFLKYDATMIBINPMVEDSDGAVLCMDAKINFDSN3AYRQK 
KI FDLQDWTQEDERDKDAAKANLNYIGLDGNIGCLVKGAGLAMA 
TMDIIKLHGGTPANFLDVGGGATVHQVTEAFKLITSDKKVLA1L 
VNI FGGIMRCDVIAQGI VMAVKDLEIKI PVWRLQGTRVDDAKA 
LI ADSGLKI LACDDLDEAARMWKLSE I VTLAKQAHVDVKFQLP 
I 


6406


1036 


16"7 


HPRQMRGEDTPEAPP YSSGR YDS I KTEVSGCPEDLTVGRAPTAD 
DDDDDHDDHEDNDKMNDSEGMDPERLKAFNMFVRLFVDENLDRM 
VPISKQPKEKIQAIIESCSRQFPEFQERARKRIRTYLKSCRRMK 
KNGMEMTRPTPPHLTSAMAENILAAACESETRKAAKRMRLEIYQ 
SSQDEPIALDKQHSRDSAAITHSTYSLPASSYSQDPVYANGGLN 
YSYRGYGALSSNLQPPASLQTGNHSNGESGEARALASRPAPSWV 
CRAALGSGMGRGKQRPVMERGCLTA 


6407 


492 


150 


VGLCLAVS QTVLAQLDALLVFPGQVAQLS CTLS PQHVT I RD YG V 
SWYQQRAGSAPRYLLYYRSEEDHHRPADIPDRFSAAKDEAHNAC 
VLTISPVQPEDDADYYCSVGYGFSP 


6408


1458 


903 


RGCITSSQAWRLFGGVTRGFNMRIEKCYFCSGPIYPGHGMMFVR 
NDCKVFRFCKSKCHKNFKKKRNPRKVRWTKAFRKAAGKELTVDN 
SFEFEKRRNEPIKYQRELWNKTIDAMKRVEBIKQKRQAKFIMNR 
LKKNKELQKVQDIKEVKQNIHLIRAPLAGKGKQLEEKMVQQLQE 
DVDMEDAP 


6409 


i3b 


446 


NTALANLLRCFTCDRLCGGCTAPAPPAHQGIVLQPVMPSCDPGP 
GPACLPTKTFRSYLPRCHRTYSCVHCRAHLAKHDELISKSFQGS 
HGRAYLFNSV 


6410 


65 


607 


RGGTAG CVACLG CWGQSS S PKAAF PAGS ACL PADS CP CLL FQAC " 
AI SGLFNC I TI HPLNIAAGVWMIMNAFILLLCEAPFCCQ FIEFA 
NTVAEKVDRLRSWQKAVFYCGMAWPIVISLTLTTLLGNAIAFA 
TGVLYGLSALGKKGDAISYARIQQQRQQADEEKLAETLEGEL 


6411 


302 


772 


RLSIMASSLNEDPEGSRITYVKGDLFACPKTDSLAHCISEDCRM 
GAGIAVLFKKKFGGVQELLNQQKKSGEVAVLKRDGRYIYYLITK 
KRASHKPTYENLQKSLEAMKSHCLKNGVTDLSMPRIGCGLDRLQ 
WENVSAMIEEVFEATDIKITVYTL 


6412 


61 


1709 


RPVTSFSPLPGSCGGRLGTRTMLGRStREVSAALKQGQITPTEL 

OQKCLSLIKKTKFLNAYITVSEEVALKQAEESEKRYKNGQSLGD 

LDGI PI AVKDNFSTSG I ETTCASNMLKGYI PP YNATWQKLLDQ 

GALLMGKTWtinFI PAMrt Qra«5TrY2VEY2i> uramtc vo vrwn cvn 
vjf^j ji.h i«jvAAi»AAUCirru*iVjOvjo 1 JLAjrvriai' V WMrWo lofty I REKRKQN 

PHSENEDSDWLITGGSSGGSAAAVSAFTCYAALGSDTGGSTRNP 
AAHCGLVGFKPSYGLVSRHGLIPLVNSMDVPGILTRCVDDAAIV 
U3AIAGPDPPJ)STTVHEPINKPFWLPSLADVSKLCIGIPKEYLV 
PELSSEVQSLWSKAADLFESEGAKVIEVSLPHTSYSIVCYKVLC 
TSEVASNMARFDGLQYGHRCDIDVSTEAMYAATRREGFNDWRG 
RILSGNFFLLKENYENYFVKAQKVRRLIANDFVNAFNSGVDVLL 
TPTTLS EAVPYLE F I KEDNRTRSAQDD I FTQAVNMAGLPAVS I P 
VALSNQGLPIGLQFIGRAFCDQQLLTVAKWFEKQVQFPVIQLQE 
LMDDCSAVLENEKLASVSLKQ 


6413 


2 


885 


HE PRCAGMAASLWMGDLE P YMDENF I S RA FATMG E T VMS VK 1 1 R 
NRLTG I PAGYCFVEFADLATAEKCLHKIWGKPLPGATPAKRPKL 
NYATYGKQPDNSPEYSLPVGDLTPDVDDGMLYEFFVKVYPSCRG 
G K WLDQTGVS KGYG FVKFTDE LEQKRALTEOQGAVG LG SKP VR 
LSVAIPKASRVKPVEYSQMYSYSYNQYYQQYQNYYAQWGYDQNT 
GS YS YS YPQYGYTQSTMQTYEEVG DDALEDPMPQLDVTEANKE P 
MEQSEELYDALMDCHWQPLDTVSSEIPAMM 


6414 


1 


538 


RGGRAALLPWRRFPCCRPHPQPARPSSRATPGPRSPGMATSIGV 
S FSV3DGVPEAEKNAGE PENTY I LRPVFQQRFRPS WKDCIHAV 
LKEELANAEYSPEEMPQLTKHLSENIKDKLKEMGFDRYKMWQV 
VIGEQRGEGVFMASRCFWnADTDNYTHDVFMNDSIjFCWAAFGC 
FYY 


6415 


2 


1168 


FVRQWQS SHRRACGLGCEARAGGGEEPRGRASS VAGW VGAFRAP 
FIEAAVAGLGAGSGKRRRGWKMPVHSRGDKKETNHHDEMEVDYA 
ENEGSSSEDEDTESSSVSEDGDSS2MDDBDCERRRMECLDEMSN 
LEKQFTDIiKDQLYKERLSQVDAKLQEVIAGKAPEYLEPLATLQE 
NMQ I RTKVAG 1 YRELCLESVKNKYECE IQASRQHCESEKLLLYD 
TVQSELEBKIRRLEEDRHSIDITSBtiWNDELQSRKKRKDPFWPD 
KKKPGWS GP Y I V YMLQDLD I LED WTTI RKAMATLG PHRVKTE P 
PVKLEKHLHSARSEEGRLYYDGEWYIRGQTICIDKKDECPTSAV 
I TT INHDE VWFKR PDGS KSKL Y I S QLQKGKYS I KHS 


6416 


410 


1519 


EIAPADLEIPACAPVLLSRATSSTMSVTGGKMAPSLTQfiiLSHL 
GLASKTAAWGTLGTLRTFLNFSVDKDAQRLLRAITGQGVDRSAI 
VDVLTNRSREQRQLISRNFQERTQQDLMKSLQAAI>SGNLERIVM 
ALLQPTAQ FDAQELRTALKASDSAVDVAIEI LATRTPPQLQECL 
AVYKHNFQVEAVDGITSETSGILQDLLLALAKGGRDSYSGIIDY 
NLAEQDVQALQRAEGPSREETWVPVFTQRNPEHLIRVFDQYQRS 
TGQELEEAVQNRFHGDAQVALLGLASVIKNTPLYFADKLHQALQ 
ETE PN YQVL IRI LIS RCETDLLS I RAEFRKKFGKSLYSS LQDAV 
KGDCQSALLALCRAEDM 


6417 


1 


845 


RGESR\n^WSELEGEAGGAGGWASSL^ARMDNRFATAFVLACVLS 
LI STI YMAAS IGTDF WYE YRS PVQENSSDLNKS I WDEFI SDEAD 
EKTYNDALFRYNGTVGLWRRCITIPKNMHWYSPPERTESFDWT 
KCVS FTLTEQFMEKFVDPGNHNSGIDLLRTYLWRCQFLLP FVSL 
GLMCFGALIGLCACICRSLYPTIATGILHLLAGLCTLGSVSCYV < 
AG 1 ELLHQKLELPDNVSGEFGWS FCLACVSAPLQFMASALFI WA 
AHTNR JCEYTLMKAYR VA 


6418


2 


662 


TRTRPRRPPGLGAAVGKAGARSTS TPAGASPAAAYQADPPPPAH 
TPAPPPPPP CGG I ACHGE PAKF YG YDNLQRQ P I FTTQQEAELVQ 
YPDCKSS5GNIGEDPDHLNQSSSPSQE4FPWMRPQAAPGRRRGRQ 
TYSRFQTLELE KE FLFNP YLTRKRRI EVSHALALTERQ VKI WFQ 
NRRMKWKKENNKDKFPVSRQEVKDGETXKEAQELEEDRAEGLTN 


6419 


1 


973 


PGRPRVRNFDLNSKSILQEFFCTRSIQIPANRSlttAMSKCPIFP 
MARSISTSGPLDKBDTGRQKLISTGSLPATLQGATDSLGLEWHL 
PSPDPVTVPYLSPLWWKELESLLENEGDHAITVADFVDHHPIV 
FWNLVWYFRRLDLPSNLPGL ILSSBHCNKYSKI PRHCMSEDS KY 
VTLIQMLWDNMKLHQDPGQPLYILWNAHTQKYPMVHLLQKSDNSF 
NfQELLKSMVKS I KMNDVYGPMSQILETLNKCPHFKRQRSLYREI 
LFLS LVALGREN I D I DAFDKE YKMA YDRLTPSQ VKS THNCDRP P 
STGVMECRKTFGEPYL 


6420


207 


1187 


RKMI DKNQTCGVGQDS VP YM I CLIH I LEE WFG^EQIlEDYLNFAN 
YLLW VFT PLI LL I L P YFTI FLL YLT 1 1 FLH I Y KJR KNVLKEAY SH 
NLWDGARKTVATLWDGHAAVWHGYEVHGMEKIPEDGPALIIFYH 
GAIP I DFYYFMAKIFIHKGRTCRWADHFVFKI PGFSLLLDVFC 

alhgprekcveilrsghllaispggvrealisdetynivwghrr 
gfaqvaidakvpi i pmftqniregfrslggtrlfrwlyekfryp 
FAPMYGGFPVKLRTYLGDPIPYDPQITAEELAEKTKNA\/OAiiD 
KHQRIPGNIMSALLERFH 


6421 


1844


362 


WALSLRR Q Pi£RK S NKLIjS PHPHSWLR S E FKMASS PAVLRASRL 
YQWSLKSSAQFLGSPQLRQVGOIIRVPARMAATLILEPAGRCCW 
DEPVR I AVRGLAPEQP VTLRAS LRDE KGALFQAHAR YRADTLGE 
LDLERAPALGGSFAGLEPMGLLWALEPEKPLVRLVKRDVRTPIA 
VELEVLDGHDPDPGRLLCQTRHERYFLPPGVRREPVRVGRVRGT 
L FLP PE PG P FPG I VDM FGTGGGLLE YRAS LLAGKG FAVMALA Y Y 
NYEDLPKTMETLHLEYFEEAMNYLLSHPEVKGPGVGLLGISKGG 
ELCLSMAS FLKG I TAAWINGS VANVGGTLRYKGETLPPVGVNR 
NRIKVTKDGYADIVDVLNSPLEGPDQKSFIPVERAESTFLFLVG 
QDDHNWKSEFYANBACKRUJAHGRRKPQIICYPETGHYIEPPYF 
P LCRAS LHALVGS P 1 1 WGGE PRAHAMAQ VDAWKQLQT FFHKHLG 
GREGTIPSKV 


6422 


181 



2133 


EGENLSWFQEFWGDIAKEFYWKTPCPGPFLRYNFDVT^KrFIE 
WM KG ATTN I CYNVLDRNVHEKKLGDKVAFYWEGNEPGETTQITY 
HQLLVQVCQPSNVLRKQGI HKGDRVAI YMPMI PELWAMLACAR 
IGALHSIVFAGFSSESLCERILDSSCSfcLITTDAFYRGEKLVNL 
KELADEALQKCQEKGF P VRCC I WKHLGRAE LGMGDS TSQ S PP I 
KRSCPDVQISWNQGIDLWWHELMQEAGDECEPEWCDAEDPLFIL 
Y TS GSTGKPKGVVHT VGG YML YVATTFKYVFDFHAED V FW CTAD 
I GW X TGHS YVTYGP LANGATS VL FEGI PTYPDVNRLWS I VDKYK 
VTKFYTAPTAIRLLMKFGDEPVTKHSRASLQVLGTVGEPINPEA 
WLW YHRWGAQRCP I VDTFWQTETGGHMLT PLPGATPMKPGSAT 
FPFFGVAPAILNESGEE LEGE AEG YLVFKQP WPG IMRTVYGNHE 
RFETTYFKKFPGYYVTGDGCQRDQDGYYWITGRIDDMLNVSGHL 
LS TAEVESAL VEHEAVAE AAWGHPHP VKGE CL YCFVTLCDGHT 
FS P KLTEELKKQ IREKIG P I ATPD Y IQNAPG 3j PKTRSG KI MRR V 
LRKIAQNDHDLGDMSTVADPSVISHLFSHRCLTIQ 


6423 


614 


1237 


ANLKEIPRDLPPfiTVLLYLDSNQITSlPNEIFKDLkQLRVLNLS 
KNGI EFIDEHAFKGVAETLQTLDLSDNRIQSVHKNAFNNLKARA 
RIANNPWHCDCTLQQVLRSMASNHETAHNVICKTSVLDEHAGRP 
FLNAANDADLCNLPKKTTDYAMLVTMFGWFTMVISYVVYYVRQN 
QEDARRHLEYLKSLPSRQKKADEPDD ISTW 


6424 


1 


1180 


KKVS WP VAAMVH CS CVL FR KYGN F I DKLRL FTRGGSGGMG YPRL 
GGEGGKGGDVVTVVAHNRMTLKQLKDRYPRKRFVAGVGANSKISA 
LKGSKGKDWB1 PVPVGIS VTDENGKI IGELNKENDRILVAQGGL 
GGKLIiTN FLPL KGQ KR 1 1 HLDLKLI ADVGLVGF PNAGKS S L LS C 
VSHAKPAI AD YAFTTLKPELGKIMYS DFKQ IS VADI, PG L I EGAH 
MNKGMGHKFLKHIERTRQLLFWDISGFQLSSHTQYRTAFETII 
LLTKELELYKEELQTKPALLAVNKMDLPDAQDKFHELMSQLQNP 
KDFLHLFEKNMIPERTVEFQHIIPISAVTGEGIEELKNCIRKSL 
DEQANQENDALHKKQLLNLWISDTMSSTEPPSKHAVTTSKMDII 


6425 


1850 


1144 


bAMHGGGGIPLETLKEESQSRHVLPASFEVNSLOKSNWGFLLTG'" 

LVGGTIiVAVYAVATPFVTPALRKVCLPFVPATMKQIENWKMLR 

CRRGSLVDIGSGDGRIVIAAAKKGFTAVGYELNPWLVWYSRYRA 
WREGVHGSAKFYTSDr.UWTPQnvcisnnrTi?mriinMNAT r\r r.vw T n 

RELEDDARVIACRFPFPHWTPDHVTGEGrDTVWAYDASTFRGRE 
KRPCTSMHFQLPIQA 


6426 


30 


565 


SRGAAVGGMSVAGGEIRGDTGGEDTAAPGRFSFSPEPTLEDIRR' - 
LHAEFAAERDWEQFHQPRNliLLALVGEVGELAELFQWKTDGEPG 
PQGWS PRBRAALQEELSDVLI YLVALAARCRVDLPLAVLSKMDI 
NRRRYPAHLARSSSRKYTELPHGAISEDQAVGPADIPCDSTGQT 
ST 


6427 


145 


959 


AASWQPPHVPKAGKMVSWMICRLWLVFGMLCPAYASYKAVKTK
MIRE YVRV7MMYWIVFALFMAAEI VTDI F I SW FPFYYEI KMAFVL 
WLLS P YTKGASLL YR KF VHPSLSRHE KE IDA YI VQAKERS YET V " 
LS FGKRGLNIAASAAVQAATKSQGALAGRLRS FSMQDLRS IS DA 
PAPA YH DP L YLE DQ VSHRR PPI G YRAGG LQDS DTED ECWS DTEA 
VPRAPAR PRE K P LI RSQS LR WKR KP P VREGTSRSLKVRTRKKT 
VPS DVDS 


6428


1982 


444 


SGSGG KMEDHQHVP ID IQTS KLLDWLVDRRHCS LKWQS LVLTI R 
EKINAAIQDMPESEEIAQLLSGSYIHYFHCLRILDIjLKGTEAST 
KNI FGRYSSQRMKDWQEI IALYEKDNTYLVELSSLLVRNVNYEI 
PSLKKQIAKCQQLQQEYSRKEEECQAGAABMREQFYHSCKQYGI 
TGENVRGE LLAL VKDLPS QLAE IGAAAQQSLGE A I DVYQAS VG P 
VCESPTEQVLPMLRFVQKRGNSTVYEWRTGTEPSVVERPHLEEL 
PEQVAEDAIDWGDFGVEAVSEGTDSGISAEAAG I DWG I FPESDS 
KDPGGDGIDWGDDAVALQITVLEAGTQAPEGVARGPDALTLLEY 
TETRNQFLDELMELEIFLAQRAVELSEEADVLSVSQFQLAPAIL 
O^QTKEKMVTMVSVTjEDIilGKLTSLQLQHLFMlLASPRYVDRVT 
EFLQQKLKQS QLLALKKE LM VQKQQE ALE EQ AALE P KLDL LLEK 
TKELQKLI E AD I S KRYSGRP VNLMGTS L 


6429


3413 


3442 


EPSSWTAAPRGP1AAHPLEAAVQEDDRRALSFDSRIKVFANGTL 
WKS VTDKDAGDYLCVARNKVGDD YVVLKVDVVMKPAKI EHKEE 
NDHK VFYGGDLKVDCVATGLPNPE IS WS LPDGSLVNSFMQSDDS 
GGRTKRYWFNNGTLYFNEVGMREEGDYTCFAENQVGKDEMRVR 
VKWTAPATI RNKTCLAVQ VP YGDWT VACEAKGE PMP KVTWLS 
PTNKVIPTSSEKYQIYQDGTLLIQKAQRSDSGNYTCLVRNSAGE
DRKTVWIHVNVQPPKINGNPNPITTVREIAAGGSRKLIDCKAEG 
IPTPRVLWAFPEGWLPAPYYGNRITVHGNGSLDIRSLRXSDSV 
QLVCMARNEG6EARLI VQLT VLEPMEKP IFHDP I SEKITAMAGH 
TISLNCSAAGTPTPSLVWVLPNGTDLQSGQQLQRFYHKADGMLH 
I S GLSS VDAG AYRC VARNAAGHTERLVS LKVGLKP EANKQ YHNL 
VSIINGETLKLPCTPPGAGQGRFSWTLPNGMHLEGPQTLGRVSL 
LDNGTLTVREASVFDRGTYVCRMETEYGPSVTSIPVIVIAYPPR 
ITSEPTPVIYTRPGNTVFCLNCMAMGIPKADITWELPDKSHLKAG 

VQARLYGNRFLHPQGSLTIQHATQRDAGFYKCMAKNILGSDSKT 
TYIHVF 


6430 


1946 


602 


RTRVSTGLRRTLLWSEAVGAS STRGDTGI PGSGEGGAGPGGGEG 
AMLEAMAEPSPEDPPPTLKPETQPPEKRRRTI EDFNKFCS FVLA 
YAGY I PPSKEESDW PASGSSS PLRGESAADS DGWDSAPSDLRTI 
QTFVKKAKSSKRRAAQAGPTQPGPPRSTFSRLQAPDSATLLEKM 
KLKDS LFDLDGPKVAS PLS PTS LTHTS R P PAALTP VPLSQGDLS 
HPPRKKDRKNRKLGPGAGAGFGVLRRPRPTPGDGEKRSRIKKSK 
KRKLKKAERGDRLPPPGPPQAPPSDTDSEEEEEBEEEEEEBEMA 
TWGGEAP VP VLPT P PE APRP PAT VH PEG VP PADSES KE VGS TE 
TSQDGDASSSEGEMRVMDEDIMVESGDDSWDLITCYCRKPFAGR 
PMI E CS LCGTW IHLS CAK I KKTNVPDFFYCQKCKELRPEARRLG 
GPPKSGEP 


6431 


3 


605 


WWNSSYNLPAYAPYLPCEACAMQDGRKGGAYAGKMEATTAGVGR
LEEEALRRKERLKALREKTGRKDKEDGEPKTKHLREEEEEGEKH 
K t liKiiKft x V P E DEDLKKRRVPQ AK P VAVE E KVKEQLEAAKPE P V 
I EEVD LANLA P RKPD WDLKRDVAKKLEKL KKRTQRAI AELI RE R 
LKGQEDSLASAVDAATEQKTCDSD 


6432


56 


1692 


GGLGTMGSRIKQNfPETTFEVYVEVAYPRTGGTLSDPEVQRQFPE 
D YSDQE VLQTLTKFCFP FYVDSLT VSQ VGQNFTF VLTDI DS KQR 
FG FCRLS SGAKSC FC I LS YLPWFE VFYKLLNI LADYTTKRQENQ 
WNELLETLHKLP I PD PG VS VHLS VH S Y FTVPDTREL PS I PENRN 
LTE YFVAVDVNNMLHL YASMLYERR I LI I CS KLSTLTACIHGSA 
AMLYPMYWQHVYIPVIiPPHXLDYCCAPMPYLIGlHLSLMEKVRN 
MALDDWI LNVDTNTLETPFDDLQSLPNDVI SSLKNRLKKVSTT 
tgdgvarafliovqaaffgsyrnalkiefebpitfceeafvshyr 
sgamrqflqnatqlqlfkqfidgrldllnsgegfsdvfeeeinm 
gbyagsdklyhqwlstvrkgsgailntvktkanpamktvykfdi 

AENGCAPTPEEO LPKTAP S PL VEAK n PK7X pn o-o d rm ru-onm ro 
PPRPH WKRPKSNI AVEGRRTSVPSPEQNTI ATPATLHI LQKS I 
TKFAAKFPTRGWTSSSH 


6433 


! 1^24 


484 


APVTKRKEVFAKDSKGSALDAGRDPKRPALPETLCESGWASNTA 
PTTPPQPGWCLCGKDFKSSCQTPGREKERRLATMHGSCSFLMLL 
LP LLLLL VATTG P VGALTDEEKRLMVE LHNL YRAQVS PTASDML 
HKRWDEELAAFAKAYARQCVWGHNKERGRRGENLFAITDEGMDV 
PLAMEEWHHEREHYNLSAATCSPGQMCGHYTQWWAKTERIGCG 
S HFCE KLQG VE ETNI ELLVCNYE P PGNVKGKR P YQEGTPCS QCP 
SGYHCKNSLCE P IGSPEDAQDLPYLVTEAPS FRATEAS DSRKMG 
AEGPDKPSWSGLNSGPGHVWGPLLGLLLLPPLVLAGIF 


6434 


40 


2002 


MPQLNFGMADPTQMGGLSMLLLAGEHALGTPEVFSGTCRPDVSE 
SPELRQKSPLFOFAEISSSTSHSDASTKQCQTSALFQFAEISSN 
TSQLGGAEPVKRCGKSALFQLAEMCLASEGMKMEESKLIKAKES 
DGGR I KBLEKGKEEICEI KMEXTDETRLQKEAEFEKSAKENLRDS 
msjjKJM if fc. AJjQ IDDL MAI KMEDPKE IRKEELEEDHKCSHFPDFS Y 
SASSKIIISDVPSRKDHMCHPHGIMIIEDPAALNKPEKLKKKKK 
KSKMDRHGNDKS TP KKTCKKRQSS ESDrESVIYTI EAVAKGDWG 
IEKLGDTPRKKVRTSSSGKGSILDAKPPKKKVXSREKKMSKEKS 
SDTTTKESRPPDFISISASKNISGETPEGIKAEPLTPMEDAIiPPS 
LSGQAKPEDSDCHRK1ETCGSRKSERSCKGALYKTLVSEGMLTS 
LRANVDRGKRSSGKGNSSDHEGCWNEESWTFSQSGTSGSKKFKK 
TKPKEDCLLGSAKLDEEFBKKFNSLPQYSPVTFDRKCVPVPRKX 
KKTGNVSSEPTKTS KGSGDKWSNKQLFLDAI HPTEAI FSEDRNT 
MEP VHKVKNI PS I FNTPEPTTTARTFGGQPKEKSKENPDYS PCQ 
DTQRAGYHHEEVLWMTNLMNNCGGVYLKQLRHTAMTNA 


6435


2227 


657 


ALQRDAAAAYAHPEYEERFLQEETVSQQINSIELLQTRPLALPE 
v vrvayxi^jjyKUVHIjKGRPASQPTVIRGITYYECAKVSEEENDIEE 
Q0DEFFSGDNGVDLLIEDQLLRHNGLMTSVTRRPAATRG<3HSTA 
VTSDLNARTAPWSSALPQPSTSDPSIANHASVGPTLQTTSVSPD 
PTRESVLQPSPQVPATTVAHTATQQPAAPAPPAVSPREALMEAM 
HTVPVPPTTVRTDSLGKDAPAGRGTTPAS PTLSPEEEDDIRNVI 
GRCKDTLSTI TGPTTQNTYGRNEGAWMKDPLAKDERI YVTNYYY 
GNTLVEFRNLENFKQGRWSNSYKLPYSWIGTGHWYNGAFYYNR 
AFTRNI I KYDLKQRY VAAWAMLHDVAYEEATPWRWQGHSDVDFA 

VDENGLWLIYPAIjDDEGFSnPVTVT.<?TfT.TJaanT C to zrc r .m t/-. 

LRRNFYGNCFVICGVLYAVDSYWQRNANISYAFDTHTNTQIVPR 
LLFENEYFYTTQIDYNPKDRLLYAWDNGHQVTYHVIFAY 


6436 


1295 


341 


GACR PP VRQD PDSG P DYEAL PAGATVTTHMVAG AVAG I LEHC VM 
YP I D C VKTRMQS LQ PD PAAR YRNVLEALWR 1 1 RTEGLWR PMRGL 
NVT ATGAG PAHAL YFACYEKLKKTLS DVI HPGGNSH I ANG AAG C 
VATLLHDAAMNPAE WKQRMQM YNS PYHRVTDCVRAVWQNEGAG 
AFYRSYTTQLTMNVPFQAIHFMTYEFU3EHFNPQRRYNPSSHVL 
SGACAGAVAAAATTPLDVCKTLLNTQESI.ALNSHITGHITGMAS 
AFRTVYQ VGG VTAY FRG VQARVI YQI PSTAIAWSVYBFFJCYLIT 
KRQEEWRAGK 


6437 


1828 


360


P PAPAP PAS PAR HVTRTARGHLEGGSRAP PLLQAVFLQ I XNMVK " 

LIHTLADHGDDVNCCAFSFSLI1ATCSI1DKTIRLYSLRDFTELPH 

SPLKFHTYAVHCCCFSPSGHILASCSTDGTTVLWNTENGQMLAV 

MEQPSGSPVRVCQ FSPDSTCLASGAADGTWLWNAQS YKLYRCG 

SVKDGSLAACAFS PNGSFFVTGSSCGDLTVWDDKMRCLESEKAH 

DLGITCCDFSSQPVSDGEQGLQFFRLASCGQDCQVKIWIVSFTH 

ILGFELKYKSTLSGHCAPVLACAFSHDGQMLVSGSVDKSVIVYD 
TNTENILHTLTQHTRYVTTCAFAPNTLLLATGSMDKTVNIWQFD 
IjBTLCQARSTEHQLKQFTEDWSEEDVSTWLCAQDLKDLVGIFKM 
NNIDGKELLNLTKESLADDLKIESLGLRSKVLRKIEELRTKVKS 
LSSGIPDEPICPI TRELMKDPV I ASDG YS YEKE AMENWD P AKRN 
RTSPP 


6438 


109 


901 


EVQ I LRAKMFQTGGLI VF YGLLAQTMAQFGGLPVPLDQTLPLNV 
NPALPLSPTGLAGSLTNALSNGLLSGGLLG1LENLPLLDILKPG 
GGTSGGLLGGLLGKVTSVIPGLNNI IDIKVTDPQLLELGLVQSP 
DGHRL YVT I PLG I KLQ VNTPLVGAS LLRLAVKLD I TAE I LAVRD 
KQERIHJUVLGDCTHSPGSLQISLLDGLGPLPIQGLLDSLTGILN 
KVL PE L VQGNVCPLVNE VLRGLD I TL VHDI VNML IHGLQF VI KV 


6439 


23 


412 


SIQTASAITTEMASQSQGIQQJjLQAEKRAAEKVADARKRKARRL 

xqakeeaqmeveqyrrerehefqskqqaamgsqgnlsaeveqat 
rrqvqgmqssqqrnrervlaqllgmvcdvrpqvhpnyrisa 


6440 


3 


517 


RARWNSDMGDLPGLVRIiSIALRIQPNDGPVFYKVDGQ^F^QNRT 
IKLLTGSSYKVEVKIKPSTLQVENISIGGVLVPLELKSKEPDGD 
RWYTGT YDTEG VTPTKSG ERQP IQI TM PFTD IGTFETVWQ V KF 
YNYHKRDHCQWGSPFSVIEYECKPNETRSLMWVNKESFL 


6441 


234 


1373


KSGGLRRRQRPGRSAAVGEEELPPGMEKFKAAMLLGSVGDALGY 
RNVCKENSTVGMKI QEELQRSGGLDHLVLSPGEWPVS DNT IMHI 
ATAEALTTDYWCLDDLYREMVRCYVE I VEKLPERRPDPATI EGC 
AQLKPNNYLLAWHTPFNEKGSGFGAATKAMCIGLRYWKPERLET 
LIEVSVECGRMTHNHPTGFLGSLCTALFVSFAAQGKPLVQWGRD 
MLRAVPLAEEYCRKTIRHTAEYQEHWFYFEAKWQFYLEERKISK 
DSENKAIFPDNYDAEEREKTYRKWSSEGRGGRRGHDAPMIAYDA 
LLAAGN S WTELCH RAM FHGGESAATGTI AGCLFG I*L YGLDL VP K 
GliYQDLEDKEKLEDLGAALYRLSTEEK 


6442


34 


796 


AEDPAGGLAGQDTMFARGLKRKCVGHEEDVEGALAGLKTVSSYS
LQRQSLLD^SLVKLQLCHMLVEPNLCRSVLIANTVRQIQEEMTQ 
DGTWRTVAPQAAERAPLDRLVSTEILCRAAWGQEGAHPASGLGD 
GHTQG P VSDLCP VTSAQAPRHLQSS AW EMDG PRENRGS FHKSLD 
Q I FETLE TKNPS CMEELFSD VDS P Y YDLDTVLTGMMGGARPGP C 
EGLEGLAPATPGPSSSCKSDLGELDHWEILVET 


6443 

f\AAA 


2 


555 


MAS PAAS S VRP PR P KKEPQTIjVI PKNAAEEQ KLKLE RLMKN PDK ' 

AVPIPEKWSEWAPRPPPEFVRDVMGSSAGAGSGEFHVYRHLRRR 

EYQRQDYMDAMAEKQKLDAEFQKRLEKNKIAAEEQTAKRRKKRQ 

KLKEKKLLAKKMKLEQKXQEGPGQPKEQGSSSSAEASGTEEEEE 

VPSFTMGR 


OHH'k 


390 


899 


GSTPRGKMRAPIPEPKPGDLIEIFRPFYRHWAIYVGDGYVVHLA
PPSEVAGAGAASVMSALTDKAIVKKELLYDVAGSDKYQVNNKHD 
DKYSPLPCSKI IQRAEELVGQE VL YKLTSENCEHFVNELRYGVA 
RSDQVRDVI IAAS VAGMGLAAMSLIGVMFSRNKRQKQ 


6445 


2 


753 


AGAAGAAGAARS PRPQAHTKGVRGLP SRRRSPDCGRMELAAGS F* 
S E EQFWE ACAELQQP ALAG AD WQLIjVETSG I S I YRLLDKKTG L Y 
EYKVFGVLEDCSPTLLADIYMDSDYRKQWDQYVKELYEQECNGE 
ivvxwc»vis.xi'i? ^MbNRDYVYIiRQRRDJjUMEGRKIHVILARSTSM 
PQLGERSGVIRVKQYKQSLAIESDGKKGSKVFMYYFDNPGGQIP 
SWLINWAAKNGVPNFLKDMARACQNYLKKT 


6446 


1 


1651 


RCPTKSPPPbTPGSRGTTAMCSLASGATCGRGAVEMBEDLPELS ' 

DSGDEAAWEDEDDADLPHGKQQTPCLFCNRliFTSAEBTFSHCKS 

EHQFNIDSMVHKHGLEFYGYIKIjINFIRLKNPTVEYMNSIYNPV 

PWEKEEYLKPVLEDDLLLQFDVEDLYEPVSVPFSYPNGliSENTS 

VVEKLKHMEARALSAEAALARAREDLQKMKQFAQDFVMHTDVRT 

CSSSTSVIADLQEDEDGVYFSSYGHYGIHEEMLKDKIRTESYRD 

FI YQNPH I FKDXWLDVGCGTG 1 LSMFAAKAGAKKVLGVDQSE I 


6447


1 1554 




LYQAMDI 1RLNKLEDTITLI KGKIEEVHLPVEKVDVI ISEWMGY 
FLLFESMLDSVLYAKNKYLAKGGSVYPDICTISLVAVSDVNKHA 
DRIAFWDDVYGPKMSCMKKAVIPEAWEVLDPKTL1SEPCGIKH 
IDCHTTSISDLEFSSDFTLKITRTSMCTAIAGYFDIYFEKNCHN 
RWFSTGPQSTKTHWKQTVFLLEKPFSVKAGEALKGKVTVHKNK 
KDPR5LTVTLTLNN5TQTYGLQ 






1068 


RI^PASWHLSGPCHATU3AANRGRALGVRAAWRGAPLCQR\7mMP 
SRTNLATGI PSSKVKYSRLSS TDDGY I DLQFKXTPP K I P YKAIA 

LATVLFLIGAFLI I IGSLLLSGYISKGGADRAVPVLI IGILVFL 
PGFYHLRIAYYASKGYRGYSYDDIPDFDD 


6448 
6449 


j[ 74 


559 


GQVLSHC YH YRSS R WRRGGLS RGRGAGVMALVPYE ETTE FGLQK 
FHKPLATFS FANHTIQ I RQDWRHLGVAAWWDAA I VLSTYLEMG 
AVELRGRSAVELGAGTGL VGI VAALLACR I R YERDNN FLAMLER 
QFIVRKVHYDPEKDVHIYEAQKRNQKEDL 




1 


1876 


EYGVCENLRKLE ITGVS CRliV VaKLLHRYRH £ LGLWQ PDIGP YG 
GLLNVWDGLFIIGWMYLPPHDPHVDDPMRFKPLFRlHIiMERKA 
ATVECMYGHKGPHHGHIQIVKKDEFSTKCNQTDHHRMSGGRQEE 
FRT WLREEWGRTLED I FH EHMQEL I LMKF I YTSQYDNCLT YRRI 
YLP PS RPDDLIKPGLFKGTYGSHGLEI VMLS FHGRRARGTKI TG 
DPN I PAGQQTVE I DLRHRI QLPDLENQRNFNE LS RI VLE VRERV 
RQSQQEGGHEAGEGRGRQGPRESQPSPAQPRAEAPSKGPDGTPG 
EDGGEPGDAVAAAKQPAQCGQGQPFVLPVGVSSRNEDYPRTCRM 
CFYGTGLIAGHGFTSPERTPGVFILFDEDRFGFVWLELKSFSLY 
SRVQATFRNADAPS PQA FDEMLKN I QS LTS 


6450 


848


269 


FVPAPRTVSGKRSLPGEWEERGEGEQRTGRBFSGNGGRAVEAAR 
MRLLCGLWLWLSLLKVLQAQTPTPLPLPPPMQSFQGNQFQGEWF 
VLGLAGNSFRPEHRALLNAFTATFELSDDGRFEVWNAMTRGQHC 
DTWSYVLIPAAQPGQFTVDHRVWTHEQAGRPQDQPAGQELVAAS 
RDAGPVHLPGQSSGPLG 


6451 


232 


939 


HSPTPPTSPRASTMEDVKLEFPSLPQCKEDAEEWTYPMRREMQE 
XLPGLFLGPYSSAMKSKLPVLQKHGITHIICIRQNIEANFIKPN 
FQQLFRYLVLD IADNP VENI I RFFPMTKEFIDGS LQMGGKVLVH 
GNAG I SR S AA F V I A YI MET FGMXYRDAFAYVQERRFC INPNAG F 
VHQLQEYEAIYLAKLTIQMMSPbQIERSLSVHSGTTGSLKRTHE 
EEDDFGTMQVATAQNG 


6452
6453 




652 


RTRGESSNMEPLAAYPLKdsfiPRAKVFAVLLSIVliCTVTLFLLQ 
LKFLKPKINSFYAFEVKDAKGRTVSLEKYKGKVSLVVNVASDCQ 
LTDRNYLGLKELHKEFGPSHFSVLAFPCNQFGESEPRPSKEVES 
FARKNYGVTFPIFHKIKILGSEGEPAFRFLVDSSKKEPRWNFWK 
YLVNPEQQWKFWRPEEPIEVIRPDIAALVRQVUKKKEDL 


6454 


827 




HRRWLPGLSMSPRRTLPRPLSLCI^LCLOiCLAAALGSAQSGSC - 
RDKKNCKWFSQQELRKRLTPLQYHVTQEKGTESAFEGEYTHHK 
DPGIYKCWCGTPLFKSETKFDSGSGWPSFHDVINSEAITFTDD 
FSYGMHRVETSCSQCGAHLGHIFDDGPRPTGKRYCINSAALSFT 
PADSSGTAEGGSGVAS PAQADKAEL 


" 6455 r 


827 


223 


HRRWLPGLSMSPRRTLPRPLSLCIiSIiCLCI.rr.AAATng&rtor'c^ — 
RDKKNCKWFSQQELRKRLTPLQYHVTQBKGTESAFEGEYTHHK 
DPGIYKCWCGTPLFKSETKFDSGSGWPSFHDVIWSEAITFTDD 
FSYGMHRVETSCSQCGAHLGHIFDDGPRPTGKRYCINSAALSFT 
P ADS SGTAEGGSGVAS PAQADKAEL 




1042 


173 


RVHLATVSASAAWDALGLPVRSHMQGSTRRMGVMTDVHRRFLQI* " 
LMTHGVLEEWDVKRLQTHCYKVHDRNATVDKLEDFINNINSVLE 
SLY I E I KRG VTEDDGRP I YALVNLATTS IS KMATDFAENELDLF 
RKALELIIDSETGFASSTNILNLVDQLKGKKMRKKEAEQVLQKF 
VQNKWLI E KEG E FTLHGRAILEMEQ Y IRETYPDAVK1 CN I CHS L 
LIQGQSCETOSIRMHLPCVAKYFQSNAEPRCPHCNDYWPriEIPK 
VFDPEKERESGVLKSNKKSLRSRQH 


6456 


2 


555 


R PQSRS ISMWRNSLLQVS SGLRWLRVCAMVD ILGERHLVTCKGA 
TVEAEAALQNKWALYFAAARCAPSRDFTPLLCDPYTALVAEAR 
RPAPFEWFVSADGSSQEMLDFMRELHGAWLAIiPFHDPYRHELR 
KRYNVTAIPKLVIVKQNGEVITNKGRKQIRERGLACFQDWVEAA 
DIFQNFSV 


6457 


23 


892 


PTTGFPVTNFPWNWPDGKPPIMILYVSKLNKl tHFPDFDKKIPV 
KLFPLPLLYVGNHISGLSSTSKLSLPMFTVLRKFTIPLTLLLET 
IILGKQYSLNIILSVFAIILGAFIAAGSDLAFNLEGYIFVFLND 
IFTAANGVYTKQKMDPKELGKYGVLFYNACFMIIPTLIISVSTG
DLQQATEFNQWKNWFILQFLLSCFLGFLLMYSTVLCSYYNSAL
TTAWGAIKNVSVAYIGILIGGDYIFSLLNFVGLNICMAGGLRY
SFLTLSSQLKPKPVGEENICLDLKS


6458 


23


892 


PTTGFP VTN F P WNWPDG KP P I M 1 L Y Vi> KLN K X U k FPDFDKKI P V 
KLFPLPLLYVGNHISGLSSTSKLSLPMFTVLRKFTIPLTLLLET 
IILGKQYSLNI ILSVFAI ILGAFIAAGSDLAFNLEGY1 FVFLND 
IFTAANGVYTKOKMDPKELGKYOVLFYNAPFMTTPTT.T T QV<5Tf% 
DLQQATEFNQWKNWF ILQFLLS CFLGFLLMYS TVhCS Y YNSAL 
TTAWGAIKNVSVAYIGILIGGDYIFSLLNFVGLNICMAGGLRY
SFLTLSSQLKPKPVGEENICLDLKS


6459 


23 


892


PTTGFPVTNFPtWWPDGKPPIMILWSKLNKIIHFPDFDKKIPV 
KLFPLPLLYVGNHISGLSSTSKLSLPMFTVLRKFTIPLTLLLBT 
IILGKQYSLNI ILSVFAI ILGAFIAAGSDLAFNLEGYI FVFLND 
IPTAANGVYTKOKMDPKELGKYaVLFYNACPMT TPTT.T T QVQTT; 
DLMATEFNQWKNWFILQFLLSCFLGFLLMYSTVLCSYYNSAL 
TTAWGAI KNVS VAY IG 1 L IGGD Y I FS LLNF VGLN I CMAGGLR Y 
SFLTLSSQLKPKPVGEEN I CLDLKS 


6460 


23 


892 


PTTGFPVTNFPWNWPDGKPPIMtLYVSKLNKIIHFPDFDKKIPV 
KLFPLPLLYVGNHISGLSSTSKLSLPMFTVLRKFTIPLTLLLET 
IILGKQYSLNIILSVFAIILGAFIAAGSDLAFNLEGYIFVFLND
IFTAANGVYTKQKMDPKELGKYGVLFYNACFMIIPTLIISVSTG
DLQQATEFNQWKNWFILQFLLSCFLGFLLMYSTVLCSYYNSAL
TTAWGAIKNVSVAYIGILIGGDYIFSLLNFVGLNICMAGGLRY
SFLTLSSQLKPKPVGBENICLDLKS 


6461 


1653 


360 


LQQRTLR I TAVGQTHP I AWMAWE P S LGAF YGPAS F I TFVNCM YF 
LSIFIQLKRHPERKYELKEPTEEQQRLAANENGEINHQDSMSLS 
L I STS ALEN EHT FHSQLLGAS LTLLL YVALWM FGALAVSLYYP L 
DLVFS F VFGATSLSFSAFF WHHCVNREDVRLAW IMTCCPGRSS 
YSVQVNVQPPNSNGTNGEAPKCPNSSAESSCTNKSASS FKNS S Q 
G CKLTNLQAAAAQ C HANS L PLNS TPQLDNS LTEHSMDND I KMHV 
APLEVQFRTNVHSS RHHKNRS KGHRASRLTVLREYAYDVPTSVE 
GSVQNGLPKSRLGNNEGHSRSRRAYLAYRERQYNPPQQDSSDAC 
STLPKSSRNFEKPVSTTSKKDALRKPAWELENQQKSYGLNLAI 
QNGP I KSNGQEGPLLGTDSTGNVRTGLWKHETT V 


6462 


3 


773 


SEELDREKKLKBDS PRKTPNKESGVPSLPVSLTS IKEEPKEAKH 
PDSQSMEESKLKNDDRKTPVNWKDSRGTRVAVSSPMSQHQSYIQ 
YLHAYPYPQMYDPSHPAYRAVS PVLMHSYPGAYLS PGFHYPVYG 
KMSGREETE KVNTS PS VNTKTTTESKALDLLQQHANQYRSKSPA 
PVEKATAEREREAERERDRHS PFGQRHLHTHHHTHVGMGYPL I P 
GQYDP FQGLTSAALVASQQVAAQASASGM FPGQRRE 


6463 


2 


350 


VILCILGGWIFKNADRSMEKKKGEPRTRAEARPWVDEDLKDSSD 
LHQAEEDADEWQESEENVEHIPFSHNHYPEKEMVKRSQEFYELL 
NKRRSVRFISNEQVPMEVIDNVIRTAGL 


6464


12


1154 


GILRQKEREERNRIHKKEILFLEHLLWPSEMSSLSGKVQTVLG 
LVEPSKLORTbTHEHLAMTFDCCYCPPPPCQEAISKBPIVMKNL 
YWIQKNAYSHKENLQLNQETEAIKEELLYFKANGGGALVENTTT 
G1SRDTQTLKRIAEETGVHIISGAGFYVDATHSSETRAMSVEQL 
TDVLMNEILHGADGTS IKCGI IGEIGCSWPLTESERKVLQATAH 
AQAQLGCPVIIHPGRSSRAPFQIIRILQEAGADISKTVMSHLDR 
TILDKKELLEFAQLGCYLEYDLFGTELLHYQLGPDIDMPDDNKR 
I RRVRLLVEEG CE DR X LVAHD IHTKTRLMKYGGHG YSH I LTNW 
PKMLLRG I T ENVL DKI L I EN P KQWLTFK 


6465 


126 


1356 


KMTVFFKTLRWHWKKTTAGLCLLTWGGHWL YGKHCDyLLRRAAC 
QEAQVFGNQL I PPN AQVKKATV FLNPAACKG KARTLFE KNAAP I 
LHLSGMDVTIVKTDYEGQAKKLLELMENTDVIIVAGGDGTLQEV 
VTGVLRRTDEATFSKIPIGFIPLGETSSLSHTLFAESGNKVQHI 
TDATIiA I VKG ETVP LD VLQ I KGEKEQP VFAMTGLRWGS FRDAGV 
KVSKYWYLEPliKIKAAHFFSTLKEWPQTHQASrSYTGPTERPPN 
EPEETPVQRPSLYRRILRRLASYV7AQPQDALSQEVSPEVWKDVQ 
LSTIELSITTRNNQLDPTSKEDFLNICIEPDTISKGDFITIGSR 
KVRNPKIiHVEGTECLQASQCTLLIPEGAGGSFSIDSEEYEAMPV 
EVKLLPRKLQFFCDPRKREQMLTSPTQ 


6466 


11*4 


828 


VARGTELSQLEKAHPPADMGRRKSKRKPPPKKKMTGTLETQFTC 
PFCNHEKS CD VKMDRARNTG VIS CT VCLEE FQT P ITYLS EPVDV 
YSDWIDACEAANQ 


6467 


301 


2571 


GELRVtALAHGELACHAVLTAS LLS LRSRLMDS DMDYER PNVET 
IKCVWGDNAVGKTRLI CARACNATLTQYQLLATHVPTVWAIDQ 
YRVCQEVLERSRDWDDVSVSLRLWDTFGDHHKDRRFAYGRSDV 
VVLCFS IANPNSLHHVKTMWYPE I KHFCPRAPVILVGCQLDLRY 
ADLEAVNRARRPLARPIKPITEIIiPPEKGREVAKELGIPYYETSV 
VAQFGIKDVFDNAIRAALISRRHLQFWKSHLRNVQRPLLQAPFL 
PPKPPPPIIWPDPPSSSEECPAHLLEDPLCADVILVLQERVRI 
FAHKIYLSTSSSKFYDLFLMDLSEGELGGPSEPGGTHPEDHQGH 
SDQHHHHHHHHHGRDFLLRAASFDVCESVDEAGGSGPAGLRAST 
SDG I LRGNGTG YLPGRGR VLSS WSRAFVS I QEEMAEDPLTYKS R 
LMVVVKMDSSIQPGPFRAVLKYLYTGELDENERDLMHIAHIAEL 
LEVFDLRMMVANILNNEAFKINQEITKAFHVRRTNRVKECLAKGT 
FS D VTF I LDDGT I S AHKPLL I S S CD WMAAMFGG P FVES STREW 
FPYTS KSCMRAVLEYLYTGMFTSSPDLDDMKLI ILANRLCLPHL 
VALTEQYTVTG LMEATQMM VD IDGDVLVFLELAQFHCAYQLADW 
CLHHICTNYNNVCRKFPRDMKAMSPENQEYFEKHRWPPVWYLKE 
EDH YQRARKERE KED YLHLKRQ PKRRWLF WNS PS S PS SS AAS S S 
SPSSSSAW 


6468


3 


1374 


DAWAGTNMAAIAPVGSPASRGPRLAAGLRLLPMLGLLQLLAEPG 
LGRVHHIiALKDDVRHKVHLNTFGFFKDGYM\AT5A^SSLSLNEPED 
KDVTIGFSLDRTKNDGFSSYLDEDVNYClLKKQSVSVTLLIIiDI 
SRSEVRVKSPPEAGTQLPKI I FSRDEKVLGQSQEPNVNPASAGN 
QTQKTQDGGKS KRSTVDSKAMGEKSFS VHNNGGAVS FQFFFNIS 
TDDQBGLYSLYFHKCLGKELPSDKFTFSLDIEITEKNPDSYLSA 
GEIPLPKLYISMAFFFFLSGTIWIHILRKRRNDVFKIHWLMAAIi 
AAt>jjoijvi?tiKiL»XHx lobybr PllsbWAVVYYITHLLKGALLF 
ITIALIGTGWAFIKHILSDKDKKIFMIVIPRRVLANVAYIIIES 
TEEGTTEYGLWKDSIiFLVDLLCCGAILFPWWS IRHLQEAS ATD 
GKGKFSRAHFVLLSLL 


6469 


3 


1374 


DAWAGTNMAALAPVGSPASRGPRLAAGLRLIiPMLGLLQLLAEPG 
LGR VHHIALKD DVRHKVHLNT FG FFKDG YMWNVS S LS LNE PED 
KDVT IG FS LDRTKNDG FSS YLDEDVN YCI LKKQS VS VTLL I LD I 
SRSEVR VKS PPEAGTQLPKI I FSRDEKVLGQSQEPNVNPASAGN 
QTQKTQDGGKS KRSTVDSKAMGEKSFSVHNNGGAVSFQFFFNIS 
TDDQEGLYSLYFHKCLGKELPSDKFTFSLDIEITEKNPDSYLSA 
GEIPLPKLYISMAFFFFLSGTIWIHILRKRRNDVFKIHWLMAAL 
PFTKSLSLVFHAIDYHYISSQGFPIBGWAWYYITHLLKGALLF 
ITIALIGTGWAFIKHILSDKDKKIFMIVIPRRVLANVAYIIIES 
TEEGTTB YGLWKDS LF LVDLLCCGAI L FP WWS I RH LQE ASATD 
GKGKFSRAHFVLLSLL 


6470 




1437 


AAASGVSS RADAP VLAQSPASAGNGRPSTP RVPGSRRHPSAPRS 
G PL PREDGCRT PGPQIiLPLPG ALLR PRTLLS S AAETG RS RHPDT 
QHPSSGGRCRGGTESPSSAAGRPASMAEAEEDCHSDTVRADDDE 
ENES PAETJ3LQAQLQM FRAQWM FELAPG VS S SNLENR PCRAARG 
SLQKTSADTKGKQEQAKEEKARELFLKAVEEEQNGALYEAIKFY 
RRAMQL V PD I E FKI TYTRSPDGDGVGNS YI EDNDDOS KMADLLS 
YFQQQLTFQESVLKLCQPELESSQIHISVLPMEVLMYIFRWWS 
SDLDLRSLEQLSLVCRGFYICARDPEIWRLACLKVWGRSCrKLV 
P YTS WREM FLERPR VR FDG VYI S KTTYI RQGEQS LDGF YRAWHQ 
VEYYRYIR FFPDGHVMMLTTPEE PQSI VPRLRTR 


6471 


1750 


293 


FFFDKMAAGGSGVGGKRSSKSDADSGFLGLRPTSVDPALRRRRR 
GPRNKKRGWRRLAQEPLGLEVDQFLEDVRLQERTSGGLLSEAPN 
EKLFFVDTGSKEKGLTKKRTKVQKKSLLLKKPLRVDLILENTSK 
VPAPKDVLAHQVPNAKKLRRKEQLWE KLAKQGE LPREVRRAQAR 
LLNPSATRAKPGPQDTVERPFYDLWASDNPLDRPLVGQDEFFLE 
QTKKKG VKR FARLHTK PS QAPAVEVAPAGAS YNPS FEDHQ TLLS 
AAHEVELQRQKEAEKLERQLALPATEQAATQESTFQELCEGLLE 
ESDGEGEPGQGEGPEAGDAEVCPTPARLArrEKKrEQQRRREKA 
VHRLRVQQAALRAAR LRHQE LFRLRG I KAQVALRLAELARRQRR 
RQARREAEADKPRRLGRLKYQAPDIDVQLSSELTDSLRTLKPEG 
NILRDRFKSFQRRNMIEPRERAKFKRiCYKVKLVEKRAFRBIQL 


6472 


3 


897 


S CGS DRAQWAME F PFD VDALF PERITVLDQHLRPPARR PGTTTP 
ARVDLQQQIMTI IDELGKASAKAQNLSAPITSASRMQSNRHWY 
ILKDSSARPAGKGAIIGFIKVGYKKLFVLDDRBAHNEVEPliCIL 
DFYIHESVQRHGHGREIiFQYMLQKERVEPHQLAIDRPSQKLIjKF 
bNKHYNLETTVPQVNNFVI FEGFFAHQHRPPAPSLRATRHSRAA 
AVDPTPAAPARKLPPKRAEGD I KPYSSSDREFLKVAVEPPWP LN 
RAPRRATP PAHP PPRSSSLGNSPERGPLRPFVP 


6473 


22 


912 


SSAVEFVWEGEfCMAAEPNKTEIQTLFKRLRAVPTNKACFDCGAX 
MPS WAS IT YGVFLCI DCSGVHRSLG VHIjSFIRSTELDSNWNWFQ 
LRCMQ VGGNANATAFFRQHGCTANDANT KYNS RAAQM YREK I RQ 
LGSAALARHGTDLWIDNMSSAVPNHSPEKKDSDFFTEHTQPPAW 
DAPATEPSGTQQPAPSTESSGLAQPBHGPNTDLLGTSPKASLEL 
KS S I IG KKKPAAAKKGLGAKKGLGAQKVSSQS FS E I ERQAQVAE 
KLREQQAADAKKQAEESMVASMRLAYQELQIDR 


6474 


3 


462 


ijQRQRQHPAAAPAVPVRCFTFCFTDI VI MPKRKS PENTEGKDGS 
KVTKQEPTRRSARLSAKPAPPKPEPKPRKTSAKKEPGAKISRGA 
KGKKEEKQEAGKEGTAPSBNGETKAEEIHISRSTVNVSTSRGTP 
PSTLSVKGQIETVRVKGTBN 


6475


3 


462 


LQRQRQHPAAAPAVPVRCFTFCFTDIVIMPKRKSPENTEGKDGS 
KVTKQEPTRRSARLSAKPAPPKPEPKPRKTSAKKEPGAKISRGA 
KG KKEE XQEAG KEGTAPS ENGETKAEE IH I S PfiT VNV<5TQd r t n 
PSTLSVKGQIETVRVKGTEN 


6476


106 


1090 


ARAMAQ YKGTMREAG RAMH LL KKRERQR EQME VLKQR I AEET I L 
KSQVDKRFSAHYDAVEABLKSSTVGLVTLNDMICARQEALVRERE 
RQLAKRQHLEEQRLQQERQREQEQRRERKRKISCLSFALDDLDD 
QADAAEARRAGNLGKNPDVDTSFLPDRDREEEENRLREELRQEW 
EAQREKVKDEEMEVTFSYWDGSGHRRTVRVRKGNTVQQFLKKAL 
QGLRKDFLELRSAGVEQLMFIKEDLILPHYHTFYDFIIARARGK 
SGPLFSFDVHDDVRLLSDATMEKDESHAGKWLRS W YEKNKHI F 
PASRWEAYDPEKKWDKYTIR 


6477 


227 


915 


LQGHIJ^IMAASRPLSRFWEMGKNIVC^GRNYADHVREMRSAVL 
SE P VLFLKPS TAYAP EGS P ILM P AYTRNLHHE LELG VVMGKRCR 
AVPEAAAMDYVGGYALCIiDMTAHDVQDECKKKGLPWTLAKSFTA 
SCPVSAFVPKEKIPDPHKLKLWLKVNGELRQEGETSSMIFSIPY 
1 1 S YVSK1 1 TLEEGDI I LTGTPKGVGPVKENDE I EAG IHGLVSM 
TFKVEKPEY 


6478


2 


1495 


FVSSR1LPESLASSEASTLEAMGRKEEDDCSSWKKQTTNIRKTF 
IFMEVLGSGAFSEVFLVKQRLTGKLFALKCIKKSPAFRDSSLEN 
EIAVLKKIKHENIVTLEDIYESTTHYYLVMQLVSGGELFDRILE 
RGVYTEKDASLVIQQVLSAVKYLHENGIVHRDLKPENLLYLTPE 
ENSKIMITDFGLSKMEQNGIMSTACGTPGYVAPEVLAQKPYSKA 
VDCWSIGVITYILLCGYPPFYEETESKLFEKIKEGYYEFESPFW 
DDIS ESAKDF I CHL LE KDPNERYTCEKALS HP W I DGNTALHRD I 
YPS VSLQIQKNFAKS KWRQAFNAAAWHHMRKLHMNLHS PGVRP 
EVENRPPETQASETSRPSSPEITITEAPVLDHSVALPALTQLPC 
QHGRRPTAPGGRSLNCLVNGSLHISSSLVPMHQGSLAAGPCGCC 

SSCLNIGSKGKSSYCSEPTLLKKANKKQNPKSEVMVPVKASGSS 
HCRAGQTGVCLIM 


6479 


3 


949 


SCRGPGWHPAGGQAGAMELLSALSLGELALSFSRVPLPPVFDLS
YFIVSILYLKYEPGAVELSRRIIPIASWLCAMLHCFGSYILADLL 
LGEPLIDYFSNNSSILLASAVWYLIFFCPLDLFYKCVCFLPVKIj 
IFVAMKEVVRVRKIAVGIHHAHHHYHHGWFVMIATGWVKGSGVA 
LMSNFEQLLRGVWKPETNE 1 LHMS FPTKASLYGAI LFTLQQTRW 
LPVSKASLI FI FTLFMVS CKVFLTATHSHSS PFDALEGYI GPVL 
FGSACGGDHHHDNHGGSHSGGGPGAQHS AMPAKS KEELS EGSRK 
KKAKKAD 


6480 


192 


514 


^DFMSIYFPI HCPDYLRS AKM TE VMMNTQ PMEE I GLSPRKDGLS Y 

QIFPDPSDFDRCCKIiKDRLPSIWEPTEGEVESGELRWPPEEFL 
VQEDEQDNCEETAKENKEQ 


6481 


110 


1131 


KSRMDLDVVWMFVIAGGTLAIPIIiAFVASFLLWPSALIRiYYWY 
WRRTLGMQVRYVHHEDYQFCYS FRGRPGHKPS ILMLHGFSAHKD 
MWLSWKFLPKNLHLVCVDMPGHEGTTRSSLDDLSIDGQVKRIH 
QFVECLKLMKKPFHLVGTSMGGQVAGVYAAYYPSDVSSLWLVCP 
AGLQYSTDNQFVQRLKELQGSAAVEKIPLIPSTPEEMSEMIiQLC 
SYVRFKVPQQILQGLVDVRIPHNNFYRKLFLEIVSEKSRYSLHQ 
NMDXIKVPTQI I WGKQDQVLDVSGADMLAKS XANCQVELLENCG 
HS WMERPRKTAKLI iDFLASVHNTDNNKKbD 


6482 


2517 


568 


E P VS KVSQSRRKAGVP TAN I EES QAVE AAMANVP WAE VCE KFQA 
ALALSRVELHKNPEKEPYKSKYSARALLEEVKALLGPAPEDEDE 
R PEAEDGPGAGDHALGL PAE WE PEG P VAQRAVRLAVT EFHLG V 
NHIDTEELSAGEEHLVKCLRLLRRYRLSHDCISLCIQAQNNLG1 
LWSEREEIETAQAYLESSEALYNQYMKEVGSPPLDPTERFLPEE 
E KLTEQERS KRFE KVYTHNLYYLAQVYQHLEMFEKAAH YCHSTL 
KRQLEHNAYHPIEWAINAATLSQFYINKLCFMEARHCLSAANVI 
FGQTGKI SATEDTPEAEGE VPEL YHQRKGE I ARCW I KYCLTLMQ 
NAQLSMQDNIGELDLDKQSELRALRKKELDEEES I RKKAVQFGT 
GE LCDAI SAVEE KVS YLRPLDFEE ARELFLLGOW YV pp a w vvn 
I DG YVTDH I E WQDHS ALFKGLAF FETDMERRCKMHKRR I AMLE 
PLTVDLNPQYYLLVNRQIQFEIAHAYYDMMDLKVAIADRLRDPD 
SH I VKK INNLNKS ALKY YQL FLDS LRDPNKVFPEH I GEDVLRPA 
MLAKFRVARLYGKIITADPKKELENLATSLEHYKFIVDYCEKHP 
EAAQE IE VE LE LSKEMVS LLPTKM E RFRTKMALT 


6483 


3 


623 


NSHliLCGLRAKAPLSANGREARAMEQRLAEFRAARKRAGLAAQP 
PAASQGAQTPGEKAEAAATLKAAPGWLKRFLVWKPRPASARAQP 
GLVQEAAQ PQG STSET PWNTAI P LPSCWDQS FLTNI TFLKVLLW 
LVLLGLFVBLEFGLAYFVLSLFYWMYVGTRGPEEKKEGEKSAYS 


6484 


201 


965 


VFNPGCEAIQGTLTAEQLERELQLRPLAGR 

QLAVKTKMSGLRPGTQVDPEIELFVKAGSDGSSIGJJCPFCQRLF 

MILWLKGVKFNVTTVDMTRKPEELKDLAPGTNPPFLVYNKELKT 

DFIKIEEFLEQTLAPPRYPHLSPKYKESFDVGCNLFAKFSAYIK 

N7QKEANKNFEKSLLKEFKRLDDYLNTPLLDEIDPDSAEEPPVS 

K « ** r ulaj uy ii i LtAUL o U Jj FKIiN 1 1 KVAAKKYRDFD I PAE FSGVW 

RYLHNAYAREEFTHTCPEDKE I ENTYANVAKQKS 


6485 


6 


1091 


FVDLVRAVEFLPCPDSQKLEKECQSSEESMGSNSMRSILEEDEE 
DEEPPRVLIjYHEPRSFEVGMLVWHKHKKYPFWPAWKSVRQRDK 
KASVLYIEGHMNPKMKGFTVSLKSLKHFDCKEKQTLLNQAREDF 
ie WC VS L X TDYR VRLGCX5S FAG S FL B YYAAD I S Y P VR KS I Q 
QDVLGTKLPQLSKGSPEEPWGCPLGQRQPCRFCMLPDRSRAARD 
RANQKLVEYIGKAKGAESHLRAILKSRKPSRWLQTFLSSSQYVT 
CVETYLEDEGQL.DLWKYLQGVYQEVGAKVLQRTNGDRIRFILD 
VLL PEAI I CA I S AGDE VDYKTAEE KYI KG P SLS YRE KB I FDNQL 
LEERNRRRR 


6486 


10 


581 


LVLQAGGAHli& PSRVTQG I Y YMi*AFS EMPKPPDYSELSDSLTLA 
GGTGRFSGPLHRAWRMMNFRQRMGWIGVGLYLLASAAAFYYVFE 
ISETYNRLALEHIQQHPEEPLEGTTWTHSLKAQLLSLPFWVWTV 
I FLVPYLQMFLFLYS CTRADPKTVG YCI IPICLAVICNRHQAFV 
KASNQISRLQLIDT 


6487 


352 


863 


SFLKPLRGKMSVTLHTDVGDIKIEVFCERTPKTCENFIJlLtASN " 
YYNGCI FHRN I KG FMVQTGD PTG TGRGGNS IWGKKFEDE YSEYL 
KHNVRG WSMANNG PNTNGS QFF I T YG KQPHLDMKYT VFGKV I D 
GLETLDELEKLPVNEKTYRPLNDVHI KDITIHANPFAQ 


6488


878 


241 


TALQEFGTSGPPLSLRFALPSGTGRPKPLPGARGPSWPPSPRVP 
ME P PNL YP VKL YVYDLS KG LARRLS P IMLGKQLEGI WHTS I WH 
KDE F F FGSGG I S SCPPGGTLLGP PDS WDVGS TE VTEE I FLE YL 
SS LGESLFRGEAYNLFEHNCNTFSNE VAQFLTGRKI PSYITDLP 
SEVLS TPFGQALRPLLDS I Q IQPPGGSS VGRPNGQS 


6489 


1457 


375 


KVA^TAI^EEELDNEDYYSIiNVRRE^SEELKAAYRRLCML"" 
YHPDKHRDPELKS QAERLFNLVHQAYE VLSDPQTRAI YDI YGKR 
GLEMEGW E WE RRRTPAE 2 R EEFE RLQRERE E RRLQQRTNPKGT 
ISVGVDATDLFDRYDEEYEDVSGSSFPQIEINKMHISQS I EAPL 
TATDTAI LSGSLS TQNGNGG GS INFALRRVTSAKGWGELE FGAG 
DLQGPLFGLKLFRNLTPRCFVTTNCALQFSSRGIRPGLTTVliAR 
NLDKNTVGYLQWHCSSPLLQVQRPHRNTRACAPEPS FRPFLHVP 
i nuH.c*^z>KjHjx. x ±>z> XJVN XoAAVJUjKiaACLSGPGSGSHQLLLTfTPR 
SKRRTGGG 


6490 


3 


1183 


HEAGCEVWLGYGPRAAAAAAATVLFGGAGPTETMFVARSIAADH
KDLIHDVSFDFHGRRMATCSSDQSVKVWDKSESGDWHCTASWKT 
HSGS VWR VTWAHP E FGQVLAS CS FDRTAAVWEB I VG ESNDKLRG 
QSHWVKRTTLVDSRTSVTDVKFAPKHMGLMLATCSADGIVRIYS 
APDVMNLSQWSLQHEISCKLSCSCISWNPSSSRAKSPMIAVGSD 
DSSPNAMAKVOI FE YNENTRKYAKAETLMTVTDPVHDI AFAPNL 
GRSFHILAIATKDVRIFTLKPVRKELTSSGGPTKFEIHIVAQFD
NHNSOVWRVSWNITGTVLASSGDDGCVRLWKANYMDNWKCTGIL 
KGNGSPVNGSSQQGTSNPSLGSNIPSLQNSLNGSSAGRKHS 


6491 


3 


1183 


HEAGCEVWLGYGPRAAAAAAATVLFGGAGPTETMFVARSIAADH 
KDLIHDVSFDFHGRRMATCSSDQSVKVWDKSESGDWHCTASWKT 
HSGSVWRVTWAHPEFGQVLASCSFDRTAAVWEEIVGESNDKLRG
QSHWVKRTTLVDSRTSVTDVKFAPKHMGLMLATCSADGIVRIYE 
APDVMNLSQWSLQHErSCKLSCSCISWNPSSSRAHSPMIAVGSD 
DS SPNAMAKVQ I FE YNENTRKYAKAETLMTVTDPVHDIAFAPNL 
GRSFIIIIAIATKDVRI FTLKPVRKELTSSGGPTKFE IHI VAQFD 
NHNSQWRVSWNITGTVLASSGDDGCVHILWKANYMDITWXCTGI L 
KGNGSPVNGSSQQGTSNPSLGSNIPSLQNSLNGSSAGRKHS


6492


34 


2573 


IPFLKSCCCCCLFDFPPPPLDQVQEEECEVERVTEHGTPKPFRK 
PDSVAFGESQSEDEQPENDI,BTDPPNWQQLVSREVLLGLKPCEI 
KRQ E VI NBL FYTERAHVRTLKVLDQVF YQRVSREGI LS P S ELRK 
IFSNLEDIL0LKIGLNEQMKAVRKR2JETSVIDQIGEDLLTWFSG 
PGEEKLKHAAATFCSNQPFALEMIKSRQKKDSRFQTFVQDAESN 
PLCRRLQLKDI I PTQMQRLTKYPLLLDNIATYTEWPTEREKVKK 
AADH CRQI LNYVNQAVKEAENKQRLE D YQRRLDTSS LKLS EY PN 
VEELRNLDLTKRKMIHEGPLWKVNRDKTIDLYTLLLEDILVLL 
QKQDDRLVLRCHS KI LASTADSKHT FS P VI KLSTVLVRQVATDN 
ivMijr v ibW5>Df\K»Ayi YEIjVAQTvSEKTVWQDLICRMAASVKEQS 
TKPIPLPQSTPGEGDNDEEDPSKLKEEQHGISVTGLQSPDRDLG 
LESTLISSKPQSHSLSTSGKSEVRDLFVAERQFAKEQHTDGTLK 
EVGEDYQIAIPDSHLPVSEERWALDALRNLGLLKQLLVQQLGLT 
EKSVQEDWQHFPRYRTASQGPOTDSVIQNSENIKAYHSGEGHMP 
vxi\3± izui nillb rti 1 ia i t APRDS VGLAPQDS QASNIL VMDH 
M IMT PEMPTM E P EGGLDDSG EHFFDARE AHSDENPS EGDG AVNK 
EE KD VNLRI SGNYL I LDG YDP VQES S TDEE VAS S LTLQPMTG I P 
AVBS THQQQHSPQNTHSDGAI S PFTPEFL VQQRWGAME YS CFE I 
QSPS S CADSQSQ 1MB YIHKI EADLEKLKKVEES YT I IiCQRIAGS 
ALTDKH3DKS 


6493 


557 


1147 


TPARMAYQGS STSDCMS KTLDSASAHFAAS A WS AP VPSR S EVA 
KEQNTGHNNINGWQPSGTSKTLYSTNMAIjSSSPGISAVQLVRT 
VGHTTTNHLI PALCTS SPQTL PMNNS CLTNAVHLNNVS WS PVN 
VHINTRTSAPSPTALKIATVAASMDRVPKVTPSSAISSIARENH 
EPERLGLNGIAETTVAMEVT 


6494 


2425


1052 


A V iWoOAKFU* J. PbtJ FHRRCRKHRlr'R k> Jj fKFPAAI HSASAVYVLD 
LKGKVLICRNYRGDVDMSEVEHFMPILMEKEEEGMLSPILAHGG 
VRFMWIKHNNLYLVATSKKNACVSLVFSFLYKWQVFSEYFKEL 
EEESIRDNFVIIYELLDELMDFGYPQTTDSKILQBYITQEGHKI, 
ETGAP RP PATVTNAVS WRS EG I KYRKNEVFLDVI BSVNLLVSAN 
GNVLR S E I VGS I KMRVFLSGM PELRLGLNDKVLFDNTGRGKS KS 
VE LED VKFHQC VRLS RFENDRTI S FI PPDGEFELMSYRLNTHVK 
PLIWIESVIEKHSHSRIEYMIKAKSQFKRRSTANNVEIHIPVPN 
DADSPKFKTTVGSVKWVPENSEI VWS I KS FPGGKE YLMRAHFGL 
PSVEAEDKEGKPPISVKFEIPYFTTSGIQVRYLKIIEKSGYQAL 
PWVRYITQNGDYQLRTQ 


6495 


2425 


1052 


AVAGGARPC f STP9SPHPPr > PPH , PPPPT PPDD&^'tvctv on*r<>iW is" 

LKGKVL I CRNYRGDVDM S EVEHFM PI LME KEEEGMLS P I LAHGG 
VRFMW IICHNNLYLVATS KKNACVSLVFS FL YKWQ VFSE YFKEL 
EEE S IRDNFVI I YELLDELMD FGYPOTTn q v r t /ycv t tt» pnu vr 
ETGAPRPPAT VTNAV5WRSEG I KYRKNEVFLDVI ESVNLLVS AN 
GNVLRSEIVG3IKMRVFLSGMPELRLGLNDKVLFDNTGRGKSKS 
VELEDVKFHQ CVRLS R FENDRT I S FI PPDGE FELMS YR LNTH VK 
PLIWIESVIEKHSHSRIEYMIKAKSQFKRRSTANNVEIHIPVPN 
DADSPKFKTTVGSVKWVPENSEIWSIKSFPGGKEYLMRAHFGL 
PSVEAEDKEGKPPISVKFEIPYFTTSGIQVRYLKIIEKSGYQAL 
PWVRYITQNGDYQLRTQ 


6496 


247 


559 


LRAVSLLPLQLVLPEYSZHSLFCIMFLCAQEWLTLGLNVPLLFY - 

HFWRYFHCPADSSELAYDPPVVMNADTLSYCQKBAWCKLAFYLL 

SFFrYLYCMIYTLVSS 


6497


1053 


352 


ANTQICRLCPRRHLHPPCGAKKGNGTEEDYNFVFKWLIGBSGV 
GKTNLLSRFTRNEFSHDSRTTIGVEFSTRTVMLGTAAVKAQIWD 
TAGLER YRAI TSAYYRG AVGALL VFDLTKHQT YAWER WLKE LY 
DHAEATIVVMLVGNKSDLSQAREVPTEEARMFAENNGLLFLETS 
ALDSTNVELAFETVL KE I FAKVS KQRQNS IRTNAI TLGSAQAGQ 
BPGPGBKRACCISL 


6498


2636 


272 


SLRIiCPWGTHLAGPTTMRLSSLLALIjRPALPLILGLSLGCSLSt 
LRVSWIQGEGEDPCVEAVGERGGPQNPDSRARLDQSDEDFKPRI 
VPYYRDPNKPYiCKVLRTRYIQTELGSRERLLVAVLTSRATLSTL 
AVAVNRT VAHHF PRLL YPTGQRGARAPAGMQWS HGD ER PAWLM 
SETLRHLHTHFGADYDWFFIMQDDTYVQAPRLAALAGHLS XNQD 
LYLGRAEEFIGAGEQARYCHGGFGYLLSRSIiLLRLRPHLDGCRG 
DILSARPDEWLGRCLIDSLGVGCVSQHQGQOYRSFELAKNRDPB 
KEGS SAFLS AFAVHP VS EGTLMYRLHKRFSALELERAYS E I EQL 
QAQIRNIiTVLTPEGEAGLSWPVGLPAPFTPHSRFEVLGWDYFTE 
GHTFSCADGAPKCPLQGASRADVGDALETALEQLNRRYQPRLRF 
QKQRLLNGYRRFDPARGMEYTLDLLLECVTQRGHRRALARRVSL 
LRPLSRVE 1 LPMPYVTEATRVQLVLPLLVAEAAAAPAFLEAFAA 
NVLEPREHALLTLLLVYGPREGGRGAPDPFLGVKAAAAELERRY 
PGTRLAWLAVRAEAPSQVRLMDWSKKHPVDTLFFLTTVWTRPG 
PE VLNRCRMNAI SGWQAFFPVHFQEFNPALS PQRSPPGPPGAGP 
D P P S P PG AD PS RG AP IGGRFDRQAS AEGCF YNAD YLiAARARLAG 
ELAGQEEEEALEGLEVMDVFLRFSGLHLFRAVEPGLVQKFSLRD 
CSPRLSEELYHRCRLSNLEGLGGRAQLAMALFEQEQAKST 


6499 


3 


2040 


SCSADTRPSGQAWPTVGLRAAAGAFRTGSPLALGPETPQVACLP 
GHPPVRPQVSGGPGAMPDPAAHLPFFYGSISRAEAEEHLKLAGM 
ADGLFLLRQCLRSLGGYVLSLVHDVRFHHFP I ERQLNGTYAIAG 
GKAHCGPAELCE FYSRDPDGLPCNLRKPCNR PSGLEPOPGVFDC 
LRDAMVRDYVRQTWKLEGE AliEQAI I S Q APQ V EKL I ATTAHERM 
PWYHS SLTREEAERKLYSGAQTDGKFLLRPRKEQGTYALSLI YG 
KTVYH YL ISQDKAGKYC I PEGT KFDTLWQLVE YL KLKADGL I YC 
LKEACPNSSASNASGAAAPTLPAHPSTLTHPQRRIDTLNSDGYT 
PEPARITSPDKPRPMPMDTSVYESPYSDPEELKDKXLFLKRDNL 
L IAD I E LGCGN FGS VRQG VYRM RKKQ I D VAI KVLKQGTE KADTE 
EMMREAQIMHQLDNPYIWLrGVCQAEALMLVMEMAGGGPLHKF 
LVGKREEIPVSNVAELLHQVSMGMKYLEEKNFVHRDLAARNVLL 
VNRHYAKISDFGLSKALGADDSYYTARSAGEO^PLKWYAPECINF 
RKFSSRSDVWS YG VTMWEALS YGQKP YKKMKGPEVMAFI EQGKR 
MECP PE C PPELY ALMSDCW I YKWEDRPD FLTVEQRMRAC Y YS LA 
S KVEGPPGSTQKAEAACA 


6500 


1773 


72* 


TGPTHAS ADAWGLVRS VTBWCANVRGNP CAAALS C PQAVLDAGK 
MLS ES S S FLKG VMLG S I FCAL I TMLGH I R IGHGNRMHHHEHHHZj 
QAPNKED ILKIS EDERMELSKS FRVYCI ILVKPKDVSLWAAVKE 
TWTKHCDKAEFFSSENVKVFES INMDTNDMWLWMRKAYKYAFDK 
YRIK2Y1JW7FLARPTTFAIIENLKYFLLKKDPSQPFYLGHTIKSG 
DLEYVGM EGGI VLS V3SMKRLNS LLN I PEKC PEQGGM I W K I S ED 
KQLA VCL K YAG VFAENAEDADGKD VFNTKS VGLS I KEAMT YH PN 
QWEGCCSDMAVTFNGLTPNQMHVMMYGVYRLRAFGPYFQ 


6501 


1 


570 


LVGMSGGGTETPVGCEAAPGGGSKKRDSLGTAGSAHLIIKDLGE 
rHSRLLDHRPVIQGETRYFVKEFEEKRGLREMRVLENLKNMIHE 

HSDHLVASEKQHNLQWDNFMKEQPNKRAEVDEEHRKAMERLKEQ 
YAEME KDLAKFSTF 


6502 


213 


1650


AGMKPDPWAGRNRTAVLPDVSVFHREDVGWIVRSWLQQSYQAVKE
KSSEALE FMKRDLTE FTQWQHDTACTI AATASWKE KLATEGS 
SGATEKMKKGLSDFLGVISDTFAPSPDKTIDCDVITLMGTPSG? 
AEPYDGTKARLYSLQSDPATYCNEPDGPPELFDAWLSQFCLEEK 
KGEISELLVGSPSIRALYTKMVPAAVSHSEFWHRYFYKVHQLEQ. 
EQARRDALKQRAEQS I SEEPGWEEEEEBLMG I SPI S PKEAKVPV 
AKISTFPEGEPGPQSPCEENLVTSVEPPAEVTPSESSESISLVT 
QIANPATAPEARVLPKDLSQKLLEASLEEQGLAVDVGETGPSPP 
IHSKPLTPAGHTGGPEPRPPARVETLREEAPTDLRVFELNSDSG 
KSTPSNNGKKGSSTDISBDWEKDFDLDMTEEEVQMALSKVDASG 
EVSGPGGSEGSEPNGPGCESSPQPAQLSPQEGPCSCLR 


6503 


213 


1650 


AGN KPD P WAGRNRTA VL PDVS VFHREDVGW WRS WLQQS YQAVKE 
KS S E ALB FMKRDLTE PTQWQHDTACT I AATAS WKE KLATEGS 
SGATEKMKKGLSDPLGVISDTFAPSPDKTIDCDVITLMGTPSGT 
AEPYDGTKARLYSLQSDPATYCNEPDGPPELFDAWLSQFCLEEK 
KGEISELLVGSPSIRALYTKMVPAAVSHSEFWHRYFyKVHQLEO 
EQARRDALKQRAEQS I S EE PGWEE E EKE LMG I S P ISP KEAKVPV 
AKISTFFEGEPGPQSPCRKNLVTSVEPPAEVTPSESSE5ISLVT 
QIANPATAPEARVLPKDLSQKLLEASLEEQGLAVDVGETGPSPP 
IHSKPLTPAGHTGGPEPRPPARVETLREEAPTDLRVFELNSDSG 
KSTPSNNGKKGSSTDISEDWEKDFDLDMTEEEVQMALSKVDASG 
EVSG PGGS EGSE PNG PGCESSPQPAQLS PQEGPCSCLR 


6504 


2131 


1294 


GKVC^VAHWVCLSILSPPPAGMKTPNAQEAEGQQTRAAAGRATG 
SANMTKKKVSQKKQRGRPSSQPCRNIVGCRISHGWKEGDEPITQ 
WKGTVLDQ VP INPSLYLVKYDGI DCVYGLELHRDE R VLS LKI LS 
DRVAS SHI SDANLANTI IGKAVEHMFEGEHGS KDEWRGMVLAQA 
PIMKAWFYITYEKDPVLYMYQLLDDYKEGDLRIMPESSESPPTE 
REPGGVVDGLIGKHVEYTKEDGSKRIGMVTHQVEAKPSVYFIKF 
DDDFH I YVYDL VKKS 


6505 


2131 


1294 


GKVCLVAHWVCLSILSPPPAGMKTPNAQEAEGQQTRAAAGRATG
SANMTKKKVSQKKQRGRPSSQPCRNIVGCRISHGWKEGDEPITQ
WKGTVLDQVPINPSLYLVKYDGIDCVYGLELHRDERVLSLKILS
DRVASSHISDANLANTIIGKAVEHMFEGEHGSKDEWRGMVLAQA
PIMKAWFYITYEKDPVLYMYQLLDDYKEGDLRIMPESSESPPTE
REPGGWDGLIGKHVEYTKEDGSKRIGMVIHQVEAKPSVYFIKF
DDDFHIYVYDLVKKS


6506 
6507






EVS P PTS CCLTVAVADPGVSEGFRGFGAGCEMPGRGRCPDCGS T 
ELVEDSHYSQSQLVCSDCGCWTEGVLTTTFSDEGNLREVTYSR 
STGENEQVSRSQQRGLRRVRDLCRVLQLPPTFEDTAVAVYQQAY 
RHSG IRAARLQKKEVLVGCCVLITCRQHNWPLTMGAI CTLLYAD 
LDV FSS T YMQ I VKLLGLD VPS LCLAEL VKTYCS S FKLFQAS PS V 
PAKYVE DKE KMLS RTMQLVE LANETWLVTGRH PLP VI TAATFLA 
WQSLQPADRLSCSLARFCKLANVDLPYPASSRLQELLAVLLRMA 
EQLAWLRVLRLDKRS WKH I GDLLQHRQS LVRSAFRDGTAB VET 
KiiiUi^^owisCGOGEGSVGNNSLGLPQGKRPASPALLLPPCMLKS 
PKRICPVPPVSTVTGDENISDSEIEQYLRTPQEVRDFQRAQAAR 
QAATSVPNPP 




1878 


929 


RSHASRLPELPSGCLVLQVQELVQMSGMEATVTI P I WQNKPHGA 
ARS WRR I GTNLP LKPCARAS FE TLPN I S D jCLRD Vp P VPTIAD 
IAWI AADEEETYARVRSDTRPLRHTWKPS PL I VMQRNASVPNLR 
GSEERLLALKKPALPALSRTTELQDELSHLRSQIAKIVAADAAS 
ASLTPDFLSPGSSNVSSPLPCFGSSFHSTTSFVISDITEETEVE 
VPELPSVPLLCSASPECCKPEHKAACSSSEEDDCVSLSKASSFA 
DMMGILKDFHRMKQSQDLNRSLLKEEDPAVLISEVLRRKFALKE 
EDISRKGN 


6508 


862 


342 


WEARKRPQRWPSERREVRVPPPHLQRGRSGLEPGTPRKMAAARP 
SI/3RVLPGSSVLFLCDMQEKFRHNIAYFPQIVSVAARMLKNTTL 
DLLDRGLQ VHVWDACS S RSQ VDRLVALARMRQSGAFL S TSEGL 
ILQLVGDAVHPQFKEIQKLIKEPAPDSGLLGLFQGQNSLLH 


6509 


2 


1053 


FVVWPRGGRKKRRQAAVTQAATRASGTPSPRDGTMTQGKLSVAN 
KAPGTEGQQQVHGEKKEAPAVPSAPPSYBEATSGEGMKAGAFPP 
APTAVPLHP S VIA YVD PSSSSS YDNG FPTGDHEL FTTPS WDDQKV 
RR VFVR KVYT I LL IQ LLVTLAWALFTFCD P VKDYVQAN PG WY W 
ASYAVFFATYLTLACCSGPRRHFPWNLILLTVFTLSMAYLTGML 
SSY^NTTSvjjbCWJITAIiVCIiSVTVFSFOTKFDFTSCWVLFVL™ 
LMTLFFSGLILAILLPFOYVPWI.HAVYAAi.rc&rjuw'T'T i?r -nr tvtv\ 

LLMGNRRHSLSPEEYIFGALNIYLDIIYIFTFFLQLFGTNRE 


6510 


37 


1156 


pcaldgcpqkgavhpllssamgLUflktqfvlhllvgfvfws 

GLVINFVQLCTLALW PVS KQLYRRLNCRLAYS LWS QLVMLLEW W 

sctectlftdqatverfgkbhaviilnhnfeidflcgwtmcerf 
gvlgsskvlajckbllyvpligwtwyfleivfckrkweedrdtw 
eglrrlsdypeymwfllycegtrftetkhrvsmevaaakglpvl 
kyhllprtkgfttavkclrgtvaavydvtlnfrgnknpsllgil 
ygkkyeadmcvrrfpledipldekeaaqwlhklyqekdalqeiy 

NQKGMFPGEOFfCPAPPPWTT.T.NlTT QURTTT t CM C>Dt?1rT rarentn 

GS PLLI LTFLG FVG AGNGHCR 


6511 


2541 


1425 


GEEQPLAAAPTECLEQVIGGAGDPGTWASFPSPLPGPAPLKGGK 
TMATNFSDIVKQGYVKMKSRKLGIYRRCWLVFRKSSSKGPQRLB 
KYPD2KS VCLRGCPKVTE ISNVKCVTRLPKETKRQAVAI I FTDD 
, oruvi r A^woajj]i/uiiswii\.iijSVfcCI^bRIjNDISLGEPDLLAPGV 
QCEQTDRFNVFLLPCPNLDVYGECKLQITHENIYLWDIHNPRVK 
L VS WPLCS LRR YGRDATR FTFEAGRMCDAGEGL YTFQTQ EG EQ I 
YQRVHSATLAIAEQKKRVLLEMEKNVRLLNKGTEHYSYPCTPTT 
MLPRSAYWHH ITGSQNI AE AS S YAG EGYGAAQAS SETDLLNRF I 
LLKPKPSQGDSSEAKTPSQ 


6512 


159 


807 


FG KKS TWF P1»5RSLRVASGRSCKLGHGG YTGSGPGFGE PRDSGA 
EVPSGSGRATGCERGGVRGARQGRAPGSSIWRKEPRMVCTRKTK 
TLVSTCVILSGMTNIICLLYVGWVTNYIASVYVRGQEPAPDKKL 
EEDKGDTLKIIERLDHLENVIKQHIQEAPAKPEEAEAEPFTDSS 
LFAHWGQELSPEGRRVALKQFQYYGYNAYLSLRLPLDRP 


6513 




756 


r vat-L r\jt oiiAgjjNLIWQLlLrl KyjjVMSi'AEGQDQGSAYANRTA 

LFPDLLAQGNASLRLQRVRVADEGSFTCFVSIRDFGSAAVSLQV 

AAPYSKPSMTIiEPNKDLRPGDTVTITCSSYQGYPEAEVFWQDGQ 

GVPLTGNVTTSQMANEQGLFDVHS I LRWLGANGTYSCLVRNPV 

LQQDAHSS VTI T PQRS PTG AVE VQ VP EDP WALVGTDATLRCS F 

S PE PG FS LAQLNL I WQLTDTKQLVHS FAEGQDQGS AYANR7ALF 

PDLLAQGNASLRLORVRVADPnQPTPPVCTtjntv-'eniiiTOT * 
. «««fiywiwi^ u jujy^ y ft v Aua^jo r i\~r VolKJJrvjSAAVSliQVAA 

P YS K P S MTLEPNKDLRPGDT VT ITCS S YQG Y PEAE VFWQDGQG V 

PLTGNVTTSQMANEQGL FD VHS I LR WLGANGT YS CLVRN PVLQ 

QDAHS S VTI TPQRS PTGAVEVQVPEDP WALVGTDATLRCS FS P 

EPGFSLAQLNLIWQLTDTRQLVHSFTEGR 


6514 


985 


302 


vgipgptissaaemedlldioeelryslatsrakIMgrraqqesa 

QAENHLNGKNSSLTLTGETSSAKLPRCRQGGWAGDSVKASKFRR 

KASEEIEDFRLRPQSLNGSDYGGDIPIIPDLEEVQEEDFVLQVA 
APPSIQi:<RVMT v RDLDNDLMKYSATnTr.iYiPTnT vr t Ttnrr nn 

EHEVRERNPSWQDDVGWDWDIILFTBVSSEVLTEWDPLQTEKEDP 
AGQARHT 


6515 


1345 


305 


GRVGS RR RGAAVPGG CGAGS TQLE VSAS ASCGALGS ADMNP I W 
VHGGGAGP I SKDRKERVHQGMVRAATVGYGILREGGSAVDAVEG 
AWALE DDPE FNAGCGSVLNTNGEVEMDAS IMDGKDLS AG AVS A 
VQCIANPI KLARLVMEKTPHCFLTDQGAAQFAAAMGVPEI PGEK 
LVTBRNKXRLE KEKHE KG AQKTDCQKNLGTVGAVALDCKGNVAY 
AT3TGG I VNKMVGRVGDS PCLG AGGYADND I GA VSTTGHGES I L 
KVNLARLTLFHIEQGXTVEEAADLSLGYMKSRVKGLGGLIVVSK 
TGDWVAKWTSTSMPWAAAKDGICLHFGIDPDDTTITDLP 


6516 


1 


1402 


t-RRLRYLGQDATAAARDLRTRGLQGVCPSATARQQVLVSALQQL ' 
KGRRSEHRNENQEMPYSTNKELILGIMVGTAGISLLLLWYHKVR 
KPGI AMKLPEFLS LGNTFNS I TLQDB I HDDQGTTVI FQERQLQ I 
LEKLNELLTNMEELKEEIRFLKEAIPKLEEYIQDELGGKITVHK 
ISPQHRARKRRLPTIQSSATSNSSEEAESEGGYITANTDTEEQS 
SEQ ID NO: 

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence 

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence 

Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, 
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, 
R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, 
Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








FP VPKAFNTR VE ELNLD VLLQKVDHLRMSE SG KS ES F EL LRDH K 
EXFRDEIEFMWRFARAYGDMYELSTNTQEKKHYANIGKTLSERA 
INRAPMNGHCHLWYAVLCG YVSEFEGLQNKINYGHLFKEIHLD I A 
IKLLPEEPFLYYLKGRYCYTVSKLSWIEKKMAATLFGKIFSSTV 
QBALHNFLKAEELCPGYSNPNYMYLAKCYTDLEENQKALKFCNL 
ALLLPTVTKEDKEAQKEMQKIMTSLKR 


6517 


3 


1414 


GRVWGGS S 5 LNAMVYVRGHAEDYFRWnRre&prMnvnur'r nvw — 
' " * w c*2v«yRyvjMKoWU XAHCIjPYFR 

KAQGHELGASRYRGADGPLRVSRGKTNHPLHCAFLEATQQAGYP 

LTEDMNGFQQEG FGWMDMT IHEGKRWSAACAYTjHPAIiS RTNLKA 

EAErLVSRVLFEGTRAVGVEYVKNGQSHRAYASKEVILSGGAIN 

SPQLL^ILSGIG^IADDLKKLGIPVVCHLPGVGQNLQDHLEIYIQQ 

ACTRP I TLHS AQKPLRK VC IGL EWLWKFTGEGATAHLE TGG FIR 

SQPGVPHPDIQFHFLPSQVIDHGRVPTQQEAYQVHVGPMRGTSV 
GWLKLRSANPODHPVIOPNYLSTPTDTPnPTiT cuvy -rooTDTmr. 

ALAPFRGKELQPGSHIQSDKEIDAFVRAKADSAYHPSCTCKMGQ 
PSDPTAWDPQTRVLGVENLRWDASIMPSMVSGNI.NAPTIMIA 
EKAADI I KGQPALWDKDVPVY KPRTLATQR 


6518 


242 


1098 


PAWNPGSEPRTRVRPRARSFPLPPPRAPRRRRHRLLRAVPGPSR 

RHRCRRRAPPPPSTMGDAGSBRSKAPSLPPRCPOGFWGSSKTKN 

LCSKCFADFQKKQPDDDSAPSTSNSQSDLFSEETTSDNNNTSIT 

TPTLSPSQQPIiPTEIiNVTSPSKEECGPCTDTAHVSLITPTKRSC 

GTDSQSENEASPVKRPRLLENTERSEETSRSKQKSRRRCFQCQT 
KLEXiV00ELGSCRC^YVPr , MTiMPT.Dirriunr»'rt7rkii«r»rij-»-r»'i-»-riik •*■*« 

KMVKLDRKVGRSCQRIGEGCS 


6519 
6520 


3 


1113 


ISKKMAKPPSPVHCVAAAAPTATVSEKEPFGKLQLSSRDPPGSLS 
AKKVRTEEKKAPRRVNGEGGSGGNSRQLQPPAAPSPQSYGSPAS 
WSFAPLSAAPSPSSSRSSFSFSAGTAVPSSASASLSQPGPRKLL 
VPPTLLHAQPHHLLLPAAAAAASANAKSRRPKEKREKERRRHGL 
GGAREAGGASREENGEVKPLPRDKIKDKIKERDKEKEREKKICHK 
VMNEIKKENGEVKILLKSGKEKPKTMIEDLQI KKVKKKKKKKHK 

ENEKRKRPKMYSKSIQTICSGLLTDVEDQAAKGILNDNIKDYVG 
KNLDTKNYDSKIPENSBFPFVSLKPPttVlTMTsiT rwror>vr\r t 
HIEHQPNGGASVrHCLQ 




3 


1113 


hKKMAEPPSPVHCVAAAAPTATVSEKEPFGKIiQIjSSRDPPGSLS 
AKKVRTEEKKAPRRVNGEGGSGGNSRQLQPPAAPSPQSYGSPAS 
HSFAPl,SAAPSPSSSRSSFSFSAGTAVPSSASASLSQPGPRKLIi 
VPPTIiLHAQPHHLLLPAAAAAASANAXSRRPKEKREKERRRHGL 
GGAREAGGASREENGEVKPLPRDKI KDKI KERDKEKEREKKKHK 
VMNE I KKENGE VK I LLKSGKEKPKTN I E DLQ I KKVKKKKKKKHK 
ENEKRKRPKMYSKSIQTICSGLLTDVBDQAAKGILNDNIKDYVG 
KNLDTKNYDSKIPENSEFPFVSLKEPRVQNNLKRLDTLEFKQLI 
HIEHQPNGGASVIHCLQ 


6521 
6522 


ida 

1042 


1798 

391 


IOjFKKATDTSQGELVHPKALPLIVGAQLIHADKLGEKVSDSTMP 
IRRTVNSTRETPPKSKLAEGEEEKPBPDISSEESVSTVEEQENE 

tppatsseaeqpkgepeneekeenksseetxkdekdqskekekk 

VKKTIPSWATLSASQLARAQKQTPMASSPRPKMDAILTEAIKAC 

fqksgaswairkyiihkypslelerrgyllkqalkrelnrgvi 

KQVKGKGASGSFVWQXSRKTPQKSRNRKNRSSAVDPEPQVKLE 
DVLPIAFTRLCEPKEASYSLIRKYVSQYYPKLRVDIRPQLIiKNA 
LQRAVERGQLEQITGKGASGTFQLKiCSGEKPLLGGSLMEYAILS 
AIAAMNBPKTCSTTALKKYVLBNHPGTNSNYQMHLLKKTLQKCB 
KNGWMEQISGKGFSGTFQLCFPYYPSPGVLFPKKEPDDSRDEDE 
DEDESSEEDSEDEEPPPKRRLQKKTPAKSPGKAASVKQRGSKPA 
PKVSAAQRGKARPLPKKAPPKAKTPAKKTRPSSTVIKKPSGGSS 
KKPATSARKE 

NIKWI^PSPRSHRTPESGRVLSLFRLPPPGMALSGSTPAPCWEED 








ECLD Y YGMLS LHRM FE WG GQ LTECJE LE LLAFLLDE APGAAGGL ~ 
SRARSGLKLLLBIiERRGOCDESNLRLLGQLLRVLARHDLLPHLA 
RKRRRPVSPERYSYGT5SSSKRTEGSCRRRRQSSSSANSQQGSP 
PTKRQRRSRGRPSGGARRRRRGPQPHPSSSQSPPDLPLKAK 


6523 


2 


16^7 


ASCQTRRRTAALDSGERIAGRRSPIAIAMASNFNDIVKQGYVKI 
RSRKLG2FRRCWLVFKKASSKGPRRLEKFPDEKAAYFRNFHKVT 
ELHNIKNI TRLPRETKKHAVAI I FHDETS KT FACES ELEAEEWC 
KHLCME CLGTR LND I S LGE PDLLAAGVQREQNER FNVYLM PT PN 
IiDIYGECTMQITHENIYLWDIHNAKVKLVMWPLSSLRRYGRDST 
V7 FT FES GRMCDTGEGLFTFQTREG EM I YQ K VHS ATLAI AEQHER 
LMLEMEQKARLQTSLTEPMTLSKSISLPRSAYWHHITRQNSVGE 
IYSLQGNHENRHSDLTGKSCKTSENRFLEENAPLVMYGITHHLF 
MDTSTCKWHDLE 


6524 


2 


1097 


ASCQTRRRTAALDSGERIAGRRSPIAIiAMASNFNDIVKQGYVKI 
RSRKLGIFRRCWLVFKKASSKGPRRLEKFPDEKAAYFRNFHKVT 
ELHN I KNI TRL PRE TKKHAYA I I FHDETS KT FACE S E LE AEEW C 
KHLCMECLGTRLNDISLGEPDLLAAGVQREQNERFNVYLMPTPN 
LDIYGECTMQITHENIYLWDIHNAKVXLVMWPLSSLRRYGRDST 
WFTFESGRMCDTCEGLFTFG^EGEMIYQKVHSATLAIAEQHER 
LMLEMEQKARLQTSLTE PMTLS KS I S LPRSAYWHH I TRQNS VGE 

IYSLQGNHENRHSDLTGKSCKTSENRFLEENAPLVMYGITHHLF 
MDTSTCKWHDLE 


6525 


1 


1859 


GESPFSEEESIEFNPSSSGRSARTVSSNSFCSDDTGWPSSQSVS 

PVKTPSDAGNSPIGFCPGSDEGFTRKKCTIGMVGEGSIQSSRYK 

KESKSGLVKPGSEADFSSSSSTGSISAPEVHMSTAGSKRSSSSR 

NRGPHGRSNGASSHKPGSSPSSPREKDLLSMLCRNQLSPVNIHP 

SYAPSSPSSSNSGSYKGSDCSPIMRRSGRYMSCGENHGVRPPNP 

EQYLTPLQQKEVTVRHLKTKLKESERRLHERESEIVELKSQLAR 

MREDW I EEE CHR VEAQLALKEARKE I KQLKQ VI ETMRS S LADKD 

KGIQKYFVDINIQNKKLESLIjQSMEMAHSGSLRDELCLDFPCDS 

PE KS LTLNPPLDTMADGLS LEEQVTGEGADRELLVGDS I ANS TD 

LFDEI VTATTTESGDLELVHSTPGANVLELLP I VMGQEEGS VW 

ERAVQTDWP YS P A I S E L I QS VLQKLQDPCPS S LAS PDE S E PDS 

MESFPESLSALWDLTPRNPNSAILLSPVETPYANVDAEVHANR 

LMR ELD FAACVEERLDG VI PLARGG WRQYWSS S FL VDLLAVAA 

P WPT VLWAFS TQRGGTD PVYNIGALLRGCCWALHSLRRTAFR 

IKT 


6526 


2 


2034 


SGRAGEPEE WRGRQI IDS KETWI P FNSEDSQQIiEEAYSSGKGCN 
GRWPTDGGRYDVHLGERWRYAVYWDELASEVRRCTWFY1CGDKD 
NKYVP YS ES FS Q VLEET YMLAVTLDE WKKKLES PNRE III LHN P 
KLMVHYQPVAGSDDWGSTPMSQGRPRTVKRGVENISVDIHCGEP 
LQIDHLVFWHGIGPACDLRFRSIVQCVNDFRSVSLNLLQTHFK 
KAQENQQIGRVEFLPVNWHSPLHSTGVDVDLQRITLPSINRLRH 
FTNDTI LDVFFYNSPTYCQTIVDTVAS BMNRI YTLFLQRNPDFK 
GGVS I AGHSLG S L I LFDI LTNQKDSLGD I DS EKGSLNI VMDQGD 
TPTLEEDLKKLQLSEFFDIFBKEKVDKEALALCTDRDLQEIGIP 
LA3irR.A.n.±LjM ir oi KAHaNbl itRPAPUPASGANI PKESEFCSSSN 
TRNGDYLDVGIGQVSVKYPRLIYKPEIFFAFGSPIGMFLTVRGL 
KRIDPNYRFPTCKGFFNIYHPFDPVAYRIEPMWPGVEFEPMLI 
PHHKGRKRMHLELREGLTRMSMDLKNNLLGSLRMAWKSFTRAPY 
PALQASETPEETEAEPESTSEKPSD VNTEETSVAVKEEVLP I NV 
GMLNGGQRI DYVLQEKPI ES FNEYLFALQSHLCYWESEDTVLLV 
LKEIYQTQGIFLDQPLQ 


6527 


1 


922 


GWVPLLSRILPSDACKIYKQGINIRLDTTLIDFTDMKCQRGDLS 
FI FNGDAAPS ES FWLDNEQKVYQR I HHEESEME TEEE VD I LMS 
SDIYSATLSTKSISFTRAQTGWLFREDKTERVGNFLADFYLVNG 








LVLESRKRREHLSBEDILRNKAIMESLSKGGNIMEQNPEPIRRQ 

SLTPPPQNTITWSEYISAENGKAPHLGRELVCKESKKTPKATIA 
MSOEFPLGIELLLNVLEWAPPKHPNKTiRPPVrjMTfr odt •cnx/wt 

DI PVFPTITATVTFQEFRYDSFDGS I FTIPDDYKEDPSRFPDL 


6528 


1 


1073 


LTGPAAAEPRCA^AGMKRAIXSRRKGVWLRLRKI LFCVLGL^IA ' 
I P PL I KLCPG I QAKL I FLNFVRVP Y F I DLKKPQDQGLNHTCN YY 
LQPEEDVTIGVWKTVPAVWWKNAQGKDQMWYEDALASSHPI ILY 
LHGNAGTRGODHRVELYKVLSSLGYHWTFDYRGWGDSVGTPSE 
RGMTY DALHV FDW I KARSGDNP VYI WGHS LGTGVATNLVRRLCE 
RET P P DAL I LES ? FTNI REEAKSH P FS VI YR Y FPGFDWFFLDP I 
rSSGIKFANDENVKHISCPLLILHAEDDPWPFQLGRKLYS IAA 
PARS FRDFKVQ FV PFHS DLG Y RHKY I YKS PEL PR I LREFLG KS E 
PEHQH 


6529 


363 


2215 


TH I RYNK IGWKTMSCGNEF VE XL KK IG Y PKADNLNGED FDWLF 
EGVBDESFLKWFCGNVNEQNVLSERELEAFS ILQKSGKPI LEGA 
ALDEALKTCKTSDLKTPRLDDKELEKLEDEVOTLLKLKNLKIQR 
RNKCQIiMASVTSHKSLRLNAKEEEATKKLKQSQGIIiNAMITKIS 
NELQALTDEVTQLMMFFRHSNLGQGTNPLVFLSQFSLEKYLSQE 
BQSTAALTLYTKKQFFQGIHEWESSNESQFFNFLKIQTPSICD 
NQEILEERRLEMARLQLAYICAQHQLIHLKASNSSMKSSIKWAE 
ESLHSLTSKAVDKENLDAKISSLTSEIMKLEKEVTQIKDRSLPA 
WRENAQLLKMPWKGDFDLQIAKQDYYTARQELVLNQLIKQKA 
SFELLQLSYEIELRKHRDIYRQLENLVQELSQ5NMMLYKQLEML 
TDP S VS QQINPRNT I DTKDYS THRLYQ VLEGENKKKELFLTHGN 
LEEVAEKLKQNISLVQDQIiAVSAQEHSFFLSKRNKDVDMLCDTL 
YQGGNQLLLSDQELTEQFHKVESQLNKLNHLLTDILADVKTKRK 
TLAKNKLHQMEREFYVYFLKDEDYLKDI VENLETQS KI KAVS L2 
D 


6530 


128 


2986 


GAAHHGAI VQVHP LLPG5 S TI M I HDLCLVF PAPAKAWYVS D I Q 
ELY I R WDKVE I GKT VKAYVR VLDLHKKP FLAK YFP FMDLKLRA 
ASPIITLVALDEALDMYTITFLIRGVAIGQTSLTASVTNKAGQR 
INS APQQ I E VF P P FRLMPRKVTLLIGATMQ VTSEGG PQPQSN I L 
FS ISNES VAL VS AAG L VQGLAI GNGT VS GLVQAVD AETGKW 1 1 
SQDLVQ VEVLLLRAVRIRAP Z MRMRTGTQMPI YVTG ITNHQNPF 
S FGNAVPGLTFHWS VTKRDVLDLRGRHHEAS IRLPSQYtfFAMNV 
LGRVKGRTGLRAWKAVDPTSGQLYGLARELSDEIQVQVFEKLQ 
LLNPEI EAEQILMSPNS Y I KLQTNRDGAASLS YRVLDGPEKVPV 
VHVDEKGFLASGSMIGTSTI EVI AQEP FGANQT 1 1 VAVKVS PVS 
YLRVSM S ? VLHTQNKEALVAVPLGMTVTFTVHFHDNS GDVFHAH 

SSVLNFATMTJDDPVOTOKrtPTMWTmn/D TT/GimT ty t offfitnxvu 
w»uinni«iu;w »Wiv r\U r iniMlLVVKl Vo VUJbl LiijRVWDAKH 

PGLSDFMPLPVLQAISPELSGAMVVGDVLCLATVLTSLEGLSGT 
WSSSANS I LHIDPKTGVAVARAVGS VTVYYE VAGHLRTYKE VW 
SVPQRIMARHLHPIQTSFQEATASKVIVAVGDRSSNLRGECTPT 
QRE VI QALH PET LI SCQS QFKPAVFD FP SQD VFTVEPQ FDTALG 
QYFCSITMHRLTDKQRKHLSMKKTALWSASLSSSHFSTEQVGA 
EVPFSPGLFADQAEILLSNHYTSSEIRVFGAPEVLENLEVKSGS 
PAVLAFAKE KS FG WPS F I TYTVGVLDPAAGS QGPLSTTLTFS S P 
VTNQAIAI PVTVAFVVDRRGPGPYGASLFQHFLDS YQVMFFTLF 
ALLAGTAVMI IA YHTVCTPRDLAVPAALTPRASPGHS PHYFAAS 
SPTSPNALPPARKASPPSGLWSPAYASH 


6531 
6532 


845 
2 


1425 
954 


PSASIPPSASPDPVPDIRTCHFCLVEDPSVGCISGSEKCTISSS 
SLCMVITIYYDVKVRRIVRGCGQYISYRCQEKRNTYFAEYWYQA 
QCCQYDYCNSWSSPQLQSSLPEPHDRPLALPLSDSQIQWFYQAL 
NLSLPLPNFHAGTEPDGLDPMVTLS LNLGLS FAJBLRRMYLFLNS 
SGLLVLPQAGLLTPHPS 

AAGPPSEWNQDSLFPEPEPGPAPQVLLGPQGPGLlkGVAPPTL 








I TDS TGTHL VLTVTN KNAHS PGLS RG S PQQPS S Q PGS P AP APS A 
QMDLEHPLQPL FGTPTS LLKKBP PGYE EAMSQQ P KQQENGS S SQ 
QMDDLFDILIQSGEISADFKEPPSLPGKEKPSPKTVCWSPLAAQ 
PS PSAELPQAAP PPPGS PSLPGRLBDFLESSTGLPLLTSGHDGP 
EPLSLIDDLHSQMLSSTAILDHPPSPMDTSELHFVPEPSSTMGL 
DLADGHLDSMDWLELSSGGPVLSLAPLSTTAPSLPSTDFLDGHD 
IiQLHWDSCZi 


6533 


1798 


373 


STISWLARVEPPRRSSGVGAARLRFPGGSRPLRARACVLALAVL 
ALLERJNNADSMSAHSMLCERIAIAKELIKRAESLSRSRKGGIEG 
GAKLCS KLKAELKFLQKVEAGKVAI KESHLQSTNLTHLRAI VES 
AENLEEWSVLHVFGYTDTLGEKQTLWDWANGGHTWVKAIGR 
KAEALHNIWLGRGQYGDKSIIEQAEDPLQASHQQPVQySNPHn 
FAF YNS VSSPMAEKLKEMG1 S VRGD I VAVNALLDHPEELQPS ES 
E SDDEG PE LLQ VTR VDREN I LAS VAF PTE I KVDVCKR VNLD I TT 
LITYVSALSYGGCHFIFKEKVLTEQAEQERKEQVLPQLEAFMKD 
KELFACESAVKDFQS I LDTLGGPGERERATVL I KR INWPDQPS 
ERALRLVAS S KINS R S LT I FGTGDTL KAITMT ANSGFVRAANNQ 
GVKFSVFIHQPRALTESKEALATPLPKDYTrDSEH 


6534 


47 


596 


katrfisaafwlnkqgv^paKlphtsWswslqtlsflfsgdla 
ekslqcfpcsamllelipllgihfvlrtaraqsvtqpdihitvs 
egaslelrcnysygatpylfwmertveeafillvclkpwrvass 
lekke ke d e s fqll lgsrynvlkahcllpli rw ltsgds lls aq 

PHCPQGL 


6535 


250 


964 


LIKTFFRDVAIQRDLLPKEKNLETLLTlJVbljEIDKAFSSHARLS - " 
ADATLLTSGTTATVALLRDG I EL WAS VGDSRAILCRKGKPMKL 
T I DHTP E RKOE KE R I KKCGG FVAWNSLGQPHVNGRLAMTRS I GD 
LDLKTSGVIAEPETKRIKLHHADDSPLVLTTDGINFMVNSQEIW 
D FVNQCHD PNEAAHAVTEQA I QYGTEDNSTAVWPFGAWG KYKN 
SEINFSFSRSFASSGRWA 


6536 


242 


1174 


SLVKEMTNQYGILFKQEQAHDDAIWSVAWGTNKKENSETWTGS 
L DDLVKVW KWRDERLDLQWSL EGHQLGWS VDI SHTL P I AAS S S 
LDAHI RLWDLENGKQ I KS I DAGPVDAWTLAFS PDSQ Y LATGTHV 
GKVNI FG VESGKKE YS LDTRGKF I LS IAYS PDGKYLASG AIDG I 
INIFDIATGKLLHTLEGHAMPIRSLTPSPDSQLLVTASDDGYIK 
I YDVQHANLAGTLSGHAS W VLNVAFCPDDTHFVSS S S DKS VKVW 
DVGTRTC VHTF FDHQDQVWGVKYNGNGS KI VS VGDDQE IH I YD C 
PI 


6537 


1638 


921 


NRFNPPPTQGPDPSLVYRPDVDPEVAKDKASFRNYTSGPLLDR^ 
FTTYKLMHTHQTVDFVRSKHAQFGGFSYKKMTVMEAVDLLDGLV 
DESDPD VDFPNS FHAFQTAEG I R KAHP DKD W FHLVGLLHDLGKV 
LALFGEPQWAWGDTPPVGCRPQASWFCDSTFQDNPDLODPRY 
STELGMYQPHCGLDRVLMSWGHDGEARGGQWGGGGRWGTVGGGG 
AEAVPAGDTLS PQSTCTR 


6538 


3345 


2412 


PYLYDFLDAL I TCQTAPEEAFI KLDGLAGMLTEQliRRLTKQ VQE 
ARHNRDDEAI KKAVNE YDETMEKYI PVLMAQAKI Y WNLENYPM V 
E K I FRKS VE FCNDHD VWKLNVAHVT.PMOPN Y VTf P A T r* p v i? d tt r v 

iu^yDWIIiNVSAIVLANLCVSYIMTSQNEKAEELMRKIEKEEEQL 
S YDDPNRKMYHLC I VNLVIGTLYCAKGNYBFGI S RVI KS LE P YN 
KKLGTDTWY YAKRC FL S LLENMS KHMI VIHDS VI QECVQ FLGHC 

ELYGTNI PAVIEQPLEEERMHVGICNTVTDESRQLKALI YE IIGW 
NK 


6539 


218 


339 


FLGAASPHPHFSSLAPHPDQPEFTPVQDELEAMELWGPGV 


6540 


3 


391 


LERLWLLLLRR P ED AMAEC PTLG EAVTDH PDRL W AWE K F VYLDE 
KQHAWLPLT I EI KDRLQLRVLLRRED WLGRPMTPTQ IGPSLLP 
I MWQL YPDGR YR SSDS S FWRL VYH I K I DGVBDMLLE LLP DD 


6541 


1165 


536 


RTLVQRRI LMLLR K P ARGRDLRGRGRGT PRGGRKGLL PTPD E F P 
RFEGGRKPDSWDGNREPGPGHEHFRDTPRPDHPPHDGHSPASRE 
RSSSLQGMDMASLPPRKRPWHDGPGTSBHREMEAPGGPSEDRGG 
KGRGG PGPAQRVPKSGRSSSLDGEHHDGYHRDEPFGGPPGS GTP 
S RGGRSGS NWGRGSNMNSG PP RRGAS RGGGRGR 


6542 


3 


3775 


S W PRGRGETGGH PGALRTRTM Q KS VR YNEGHALY IAFLARK EGT 
KRGFhS KKTAEAS R WHEKWFAL YQNVLFY FEGEQS CRPAGM YLL 
EGCSCERTPAPPRAGAGQGGVRDALDKQYYFTVLFGHEGQKPLE 
LRCEEEQDGKEWMEAIHQASYADILIERBVLMQKYIHLVQIVET 
E K I AANQLRHQLEDQDTE I ERL KSE 1 I ALNKTKERMRP YQSNQ E 
DEDPDIKKIKKVQSFMRGWLCRRKWKTIVQDYICSPHAESMRKR 
NQ I VFTMVEAES E YVHQLY ILVNGFLR PLRMAASS KKP P I SHOD 
VSS I FLNSETI MFLHE I FHQGLKAR I ANWPTLILADLFDI LLPM 
LNI YQ E PVRNHQ YS fcQ VLANCKQNRDFDKLIjKQ YEANPACEGRM 
LETFLTYPMFQIPRYItTIiHELIAHTPHEHVERKSr^FAKSKLE 
ELS RVMHD EVS DT EN IRKNLAI ERM I VEGCD I LLDTSQT FI RQG 
SLIQVPSVERGKLSKVRLGSLSLKKEGERQCFLFTKHFLICTRS 
SGGKLHLLKTC5GVLSI»IDCTLIEEPDASDDDSKGSGQVFGHXjDF 
KIWEPPDRAAFTVVLLAPSRQEKAAWMSDISQCVDNIRCNGLM 
TI VFEENS KVTVPHMIKSDARLHKDDTDICFS KTLNSCKVPQI R 
YASVERLLERLTDLRFLSIDFLNTFLHTYRIFTTAAWIiGKLSD 
IYKRPFTS I P VRSLELFFATSQNNRGEHLVDG KS PRLCRKFSS P 
PPLAVSRTS S P VRARKLSLTSPLNS KIGALDLTTSSS PTTTTQS 
PAASPPPHTGQI PLDIiSRGLSS PEQS PGTVEENVDNPRVDLCNK 
LKRS IQKAVLES APADRAGVESSPAADTTELS P CRS PSTPRHLR 
YRQ PGGQTADNAHCS VS PASAFAX ATAAAGHGS PPG FNNTERTC 
DKEFI IRRTATNRVLNVLRHWVSKHAQDFELNNELKMNVLNLLE 
E VLRDPDLLPQERKAAAN I LMALSQ DDQDDI HIiKI*EDI I QMTDC 
MKAE CFES L SAME LAEQ I TLLDHV I FRS I PYE E FIX3QGWM KLDK 
NERTPYIMKTSQHFNDMSNLVASQIMNYADVSSRANAIEKWVAV 
AD I CRCLHNYNGVLE I TS ALNRSAI YRLKKTWAKVSKQTKALMD 
KLQKTVSSEGRFKNLRETLKNCNPPAVPYLGMYLTDLAFIEEGT 
PNFTEEGLVNFSKMRMISHIIREIRQFQQTSYRIDHQPKVAQYL 
LDKDLI IDEDTLYELSLKI EPRLPA 


6543 


1857 


950 


FVSGCGRAGIGLSWAMAAEARVSRWYFGGLASCGAACCTHPLDL 
LKVHI^TQQEVKLRMTGMAIiRVVRTDGILALYSGLSASLCRQMT 
YS LTRFAI YETVRDRVAKGS QGPLP FHEiCVLLGS VSGLAGG F VG 
TPADLVNVRMQNDVKLPQGQRRNYAHALDGLYRVAREEGLRRLF 
SGATMASSRGAbVTVGQLSCYDQAKQLVLSTGYLSDNIFTHFVA 
SFIAGGCATFLCQPLDVLKTRLMNSKGEYQGVFHCAVETAKLGP 
LAFYKGLVPAGIRLIPHTVLTFVFLEQLRKNFGIKVPS 


6544 


630 


79 


PSPCFIRSRLDGQPWMAGLEAWLSQNFSLHQPQSRVRVRRASIS 
EPSDTD P E PRTLN PS P AGWFVQQHPELE LMS S FR ERFGRNWLQ Y 
RSHLEPSGNPLPATPTTSAPSAPPASSQGPDTAPRPSPPQEEAR 
GPQES PQ KMSEE VRAEPQEEEEEKEGKEEKEEGEMAPLPEAHLG 
EGKQKECP 




176 


560 


pphshaallpaamtplltlilwlmglplaOaldchvcayngdn'"" 

CFNPMRCPAMVAYCMTTRTYYTPTRMKVSKSCVPRCFETVYDGY 
S KHASTTS CCQ YDLCNGTGLATPATLALAP IIjLATLWGLL 


6546 


1657 


364 


HLLNGLDEVAAFFVADLGAIVRKHFCFLKCLPRVRPFYAVKCNS 
S PG VL KVLAQLGLG FS CANKAEMEIjVQHI G I FASK 1 1 CANP CKQ 
IAQI KYAAKHG IQLLSFDNEMELAKWKSHPS AKMVLCIATDDS 
HSLSCLSLKFGVSLKSCRHIiLENAKKHHVEWGVSFHIGSGCPD 
PQAYAQSIADARLVFEMGTELGHKMHVLDLGGGFPGTEGAKVRF 
EE I AS VINSALDLYFPEGCGVDI FAELGRY YVTSAFTVAVS I IA 
KKE VLLDQPGREEENGSTS KT 1 VYHLDEGVYG I FNS Vh FDJttCP 








tpilqkkpsteoplyssslwgpavdgcdcvaeglwlpqlhVgdw ' 

LVFDNMGAYTVGMGS PFWGTQACHITYAMS RVAWEALRRQLMAA 
EQEDD VEG VC KPLS CGWEI TDTLC VGP VFTPAS IM 


6547 


1 


541 


liHSKYLAPALCSQPGMMRCCRRRCCCRQPPHALRPLLLLPLVLL 
PPLAAAAAGPNRCDTIYQGFAECLIRLGDSMGRGGELETICRSW 
NDFHACASQVLSGCPEEAAAWESLCJQEARQAPRPNNLHTLCGA 
P VH VRERGTGS ETNQETLRATAPAL PMAPAPPLLAAALALAYLL 
RPLA 


6548 


2 


219 


FVSRLSVRDVRFPTFLGGHGADAMHTDPDYSAAYVP1ETDAEDG 
I KG CGI TFTLG KG TE VGELKI LSRFQNA 


6549 


73 


1490 


ETGRVCEDARPACGSRS RRRRKEAAPGI PTPS PSS SS PTSSRPA 
ARAFSKAPARLSRPRAREEPPDPGRRYIQEEIIQARKHKLIKMC 
SSVAAKLWFLTDRR1REDYPQKE1LRALKAKCCEEELDFRAWM 
D EWLT I EQGNLGLR INGEL I TAY PQ WWRVPT P WVQS DS D I T 
VLRHLEKMGCRLMNRPQAILNCVNKFWTFQELAGHGVPLPDTFS 
YGGHBMFAKMIDEAEVLEFPMWKKTRGHRGKAVFI^DKHHLA 
DLSHLIRHEAPYLFQKYVKESHGRDVRVIWGGRWGTMLRCST 
DGRMQSNCSLGGVGMNCSLSEQGKQIAIQVSNILGMDVCGIDLL 
MKDDGSFCVCEANANVGFIAFDKACNLDVAGI IADYAASLLPSG 
R LTRRMSLLS WSTAS ETS E PELG P PASTAVDNMS AS SS S VDSD 
PBS TERELLTKLPGGLFNMNQLLANB I KLLVD 


6550 


2293 


922 


FRVSRI^PDCXSIEOMGliAMEHGGSYARAGGSSRGCWYYLR^PF - 

LFVSLIQFL I ILGLVLFMVYGNVHVS TESNLQATERRAEGLYSQ 

LLGLTASQSNLTKELNFTTRAKDAIMQMWLNARRDLDRINASFR 

QCQGDRVIYTNNQRYMAAIILSEKQCRDQFKDMNKSCDALLFML 

NQKVKTLEVEIAKEKTICTKDKESVLLNKRVAEEQLVECVKTRE 

LQHQERQLAKEQLQKVOALCLPLDKDKFEMDLRNLWRDS 1 1 PRS 

LDNI^YNLYHPUSSELASIRRACDHMPSLMSSKVEELARSLRAD 

I ER VARENS DLQRQKLEAQQGLRAS QEAKQKVE K E AQAREAKLQ 

AECSRQTQLALEEKAVLRKERDNLAKELEEKKREAEQLRMELAI 

RNS ALDTC I KTKSQPMMP VS RPMGP VPNPQP I D PASIiEEFKRKI 

LESQRPPAGIPVAPSSG 


6551 


157 


748 


I QP PDPRNMTLAAYKE KM KE LPLVS L F CSCFLADPLNKS S YK YE 
ADT VDLNWC V I SDMEV I ELNKCTSGQS FE VI LKP P S FDG VP E FN 
ASLPRRRDPSLEEIQKKLEAABERRKYOEAELLKHIiAEKREHER 
EVIQKAIEEimNFIKMAKEKLAQKMESNKENREAHIiAAMLERLQ 
EKDKHAEEVRKNKELKEEASR 


6552 


157 


748 


I QP P DPRNMTLAA YKE KMKEL PLVS LFCS CFLADP LNKS S YKYE 
ADTVDLN WCVI S DME VIBLNKCTSGQS FE VI LKP P S FDGVPE FN 
ASLPRRRDPSLEEIQKKLEAAEERRKYQEAELIiKHLAEKREHER 
EVIQKAIEENNNFIKMAKEKLAQKMESNXENREAHLAA>ILERI^ 
EKDKHAEEVRKNKELKEEASR 


6553 


2 


1807 


F VWS KMAAHLS YGR VNLNVLR E AVRRELREFLDKCAGS KAI VWD 
E YLTG P FGLIAQ YS LLKEHEVE KMFTLKGNR LPAAD VKN 1 1 FFV 
RPRLELMDIIAENVLSEDRRGPTRDFHILFVPRRSLLCEQRLKD 
WSVLGSFIHREBYSLDLIPFDGDIjLSMESEGAFKECYLEGDQTS 
LYHAAKGLMTLQALYGTIPQI FGKGECARQVANMMIRMKREFTG 
SQNSIFPVFDNLLLLDRNVDLLTPLATQLTYEGLIDEIYGIQNS 
YVKLPPEKFAPKKQGDGGKDLPTEAKKLQLNSAEBLYAEIRDKN 
FNAVGSVLSKKAKI ISAAFEERHNAKTVGE TKQFVSQLPHMQAA 
RGSLANHTSIAELI KDVTTSEDFFDKLTVEQEFMSG 1 DTDKVNN 
Y I EDC I AQKHS L 1 KVLRLVCLQS VCNSGLKQ KVLDY YKRE I LQT 
YGYEHILTLHNLEKAGLLKPQTGGRNNYPTIRKTLRLWMDDVNE 
QNPTDISYVYSGYAPLSVRLAQLLSRPGWRSIEEVLRILPGPHF 
EERQPLPTGLQKKRQPGENRVTLIFFLGGVTFAEIAALRFLSQL 
EDGGTEYVIATTKLMNGTSWIEALMEKPF 


6554 




1244 


FEMGSQVSVESGALHWIVGGGFGGIAAASQLOALNVPFMLVDM 
KDSFHHNVAALRASVETGFAKKTFISYSVTFKDNFRQGLVVGID 
LKNQMVLLQGGEALPFSHLILATGSTGPFPGKFNEVSSQQAAIQ 
AYEDM VRQVQRSRFI WVGGGSAGVEMAAE I KTEYPEKEVTLIH 
SQVALADKEbLPSVRQEVXEILLRKGVQLLLSERVSNLEELPLN 
BYREYIKVQTDKGTEVATNLVILCTGIKINSSAYRKAFESRLAS 
SGALRVNEHLQVEGHSNVYAIGDCADVRTPKMAYIAGLHANIAV 
AN I VNS VKQR PLQAYKPGA1»TFLLSMGRNDGVGQISGFYVGRLM 
VRLTKSRDLFVSTSWKTMRQSPP 


6555 


1552 


498 


IHMALLRKINQVLLFLLIVTLCVILYKKVHKGTVPKNDAbDESE 
T PE ELEEE I P VVI CAAAGRMGATMAAINS I YSNTDANI L F YWG 
LRNTLTR I RKW I EHS KLRE I NFKI VE FNPKGLKGKI RPDS SRPE 
LLQPLNFVRFYliPLLIHQHEKVIYLDDDVIVQGDIQELYDTTIiA 
LGHAAAFSDDCDLPSAQDINRLVGIiQNTYMGYLDYRKKAIKDLG 
ISPSTCSFNPGVIVANMTEWKHQRITKQLEKWMQKNVEENLYSS 
SIjGGGVATSPMLIVFHGKYSTINPLWHIRHLGWNPDARYSEHFL 
QEAKliLHWNGRHKPWDFPSVHNDLWESWFVPDPAGIFKLNHHS 


6556 


241 


1449 


ASLCKGCFFVTHVLVilLPSLQSPPTFGFLLDIDGVLVRGHRVI 
PAALKAFRRLVNS QGQLR VP WFVTNAGNI LQHS KAQELSALLG 
CEVDADQVILSHSPMKLFSEYHEKRMLVSGQGPVMENAQGLGFR 

nwtvdelrmafplldiwdlerrlkttplprndfpriegvlllg 

EP VRWETSLQIi I MDVLLSNG S PGAGLATPPYPHLPVLASNMDLL 
W^EAKMPRFGHGTFI^CIiETIYQKVTGKELRYEGLMGKPSILT 

yoyaedlirrqaerrgwaapirklyavgdnpmsdvyganlfhqy 
lqkathdgapblgaggtrqqqpsasqscisilvctgvynprnpq 
stepvlgggeppfhghrdlcfspglmeashwndvneavqlvfr 

KEG WALE 


6557 


2598 


1534 


RMCGRTSCHLPRDVLTRACAYQDRRGQQRLPEWRDPDKYCPSYN 
KS P QS N9 PVLLSRLHFEKDADSSER 1 1 APMRWGL VPS WFKES DP 
SKI^F^TTNCRSDTVMEKRSFKVPLGKGRRCVTOiADGFYEWQRC 
QGTNQRQPYFIYFPQIKTEKSGSIGAADSPENWEKVWDNWRLLT 
MAGIFDCWEPPEGGDVLYSYTIITVDSCKGLSDIHHRMPAILDG 
EEAVS KWLDFGEVSTQEALKL I HPTENITFHAVSS VVNNSRNNT 
PECLAPVDLVVKKELRASGSSQRMLQWLATKSPKKEDSKTPQKE 

ESDVPQWSSQFLQKSPIiPTKRGTAGLLEQWLKREKEEEPVAKRP 
YSQ 


6558 


21 


1138 


FHGRRRGGRiWELGSCLEGGREAAEEEGEPEVKitoliLCV^FAS" 
VASCDAAVAQC FLAE NDWEMERALNS Y FE PP VEE SALERRPE T I 
SEPKTYVDLTNEETTDSTTSKISPSEDTC3QENGSMFSLITWNID 
GLDLtJNLSERARGVCSYIALYSPDVIFLQEVIPPYYSYLKKRSS 
NYEI I TGHEEGYFTAIMLKKSRVKLKSQEI I PFPSTKMMRNLLC 
VHVNVSGNELCLMTS HLESTRGHAAERMNQLKMVLKKMQEAPES 
ATVIFAGDIWIiRDREVTROGGLPNNIVDVWEFLGKPKHCQYTWD 
TQMNSNLG I TAACKLR FDR I FFRAAAE EGH I I PRSLDLLGLEKL 
DCGRFPSDHWGLLCNLDIIL 


6559 


3 


364 


UPELSGLPTRPKKLKANQTPlAMDCCASRSCSVPTGPATTlCSS 
DKSCRCGVCLPSTCPHTVWLLEPTCCDNC!PPPr , PTTJnDr^7D»r/-»T? 
LLNSCQ PT PGLETLNLTTFTQ PCCE PCLPRG C 


6560 


3 


1435 


TATSGG I WiiRRKWRCHWPRPLPQS CVGTEGGLQ VRDTSSRX AKG 
GVDHTKMSLHGASGGHERSRDRRRSSDRSRDSSHERTESQLTPC 
IRNVTS PTRQHHVEREKDHSS S R PS S PRPQ KAS PNGS I S S AGNS 
S RNSSQS S S DGS CKTAG EMVF VYENAKEGARN I RTS E R VTLI VD 
NTRFWDPS I FTAQPNTMLGRMFGSGREHNFTRPNEKGE YE VAE 
G I G STVFRA I LD YY KTG I IRCPDG I S I PELRE ACDYLC I S FEYS 
XI KCRDLSALMHELSNDGARRQFEFYLEEMILPLMVASAQSGER 
BCHIWLTDDDWDWDEEYPPQMGEEYSQIIYSTKLYRFFKYIE 








NRDVAKSVLKERGLKKIRLGIEGYPTYKEKVKKRPGGRPBvTyN - 
* vynrr x kfisvv a tvxi av? fvo KHv iJfcyCVICSKSITNLAAAAADIPQD 
QLWMH PT?Q VDELD I LPIHP PSGNS DLD PDAQNPML 


6561 


3 


1086 


PGRRFRRKESSS SRW FPADCLLGLRGPASSLLS" pE PS PSWPSHS 
PCPMAALTDLSFMYRWFKNCNLVGNLSEKYVFITGCDSGPGNLL 
AKQLVDRGMQVLAACFTEEGSQKLQRDTSYRLQTTLLDVTKSES 
I KAAAQ VIVRDKVGEQGLWAIiVNNAGVGL PSGPNEV7LTKDD FVKV 
I>W^VGLI2VTI/HMIiPMVKRARGRVVNMSSSGGRVAVIGGGYC 
VSKFGVEAFSDS I RRELYYFGVKVCI I E PGN YRTAILGKENLES 
RMRKLWERLPQETRDSYGEDYFRIYTDKLKNIMQVARPRVRDVI 

NSMEHAI VSRS PRI RYNPGLDAKLLYI PLAKLPTPVTDFILSRY 
LPRPADSV 


6562 


1 


1562 



MSTLYDI RAHKAQLLRFFASSDSNKALEQRRTLHTPKLEHLDRV 
LYEWFLGKRS EGVPVS G PMLIEKAKDFYEQMQLTEPCVFSGGWL 
WRFKARHGIKKLDASSEKQSADHQAAEQFCAFFRSLAAEHGLSA 
EQVYNADETGLFWRCLPNPTPEGGAVPGPKQGKDRLTVLMCANA 
TGSHRLKPLAIGKCSGPRAFKGlQHIiPVAYKAQGNAWVDKEIFS 
DWFHHIPVPSVREHFRTIGLPEDSKAVLLLDSSRAHPQEAELVS 
SNVFTIFLPASVASLVQPMEQGIRRDFMRNFINPPVPLQGPHAR 
YNMNDAI FS VACAWNAVP S HVFRRAWRKLWP SVAFAEG S S SEE E 
LEAECFPVKPHNKSFAH1LELVKEGSSCPGQLRQRQAASWGVAG 
REAEGGRPPAATSPAEWWSSEKTPKADQDGRGDPGEGEEVAWE 
QAAVAFDAVLR FAERQ P C FS AQ EVG Q LRALRAVFRSQ QQ VRRRR 
GALGAWKVEAbQEGPGGCGATAQSPLPCSSTAGDN 


6563 


1319 


2694 


LAR PAQP VL LRE PEGAG P P VPAGHLVHHLGGGHLRERAH P D£*E A 
HEHPLPCDQMFWRQMGGHLRMVE ANS RG WWG IG YDHTAWVYTG 
GYGGGCFQGLASSTSNIYTQSDVKCVHIYENQRWNPVTGYTSRG 
LPTDRYMWSDASGLQECTKAGTKPPSLQWAV7VSDWFVDFSVPGG 
TDQEGWQYASDFPASYHGSKTMKDFVRRRCWARKCKLVTSGPWL 
E VP P XALRDVS 1 1 PESPGAEGSGHS IALWAVSDKGD VLCRLGVS 
E LNPAGS S W LHVGTDQPFAS IS I GACYQVWAVARDGSAF YRGS V 
YPSQPAGDCWYHIPSPPRQRLKQVSAGQT5VYALDENGNLWYRQ 
G I TPS YPQGS S WEHVSNNVCR VS VG PLDQ VW VIANK VQG SHSLS 
RGTVCHRTGVQPHE PKGHGWD YG I GGG WE H I S VRANATRAPRSS 
SQEQEPSAPPEAHG PVCC 


6564 


1 


9^ " 


APGSCAbWSYCGRGWSRAMRGCQLIjGLRSSWPGDI^SARLliSQE""" 
kj<aa£. I Ht\i FE TVS BEE KGG KVYQ VFES VAKK YDVMNDMMSLG I 
HRVWKDLLLWKMHPLPGTQLLDVAGGTGDIAFRFLNYVQSQHQR 
KQKRQLRAQQNLS WEE I AKE YQNEEDS LGGSRVWCD 2NKEMLK 
VGKQ KALAQG YRAGLAW VLGDAEEL PFDDDKFD I YTIA FG I RNV 
TH I DQALQEAHR VLKPGGR FkCLE FS QVNNPL I S R LYDL YS FQ V 
I P VLGEVIAGDWKS YQYLVES IRRFPSQEEFKDMIEDAGFHKVT 
YESLTSGIVAIHSGFKL 


6565 


1464 


999 


RSAVANGLTKRRMGL KLNGR Y I SL I LAVQ I AYLVQA VRAAG KCD 
AVFKGFSDCXLKLGDSMANYPQGLDDKTNIKTVCTYWEDFHSCT 
VTALTDCOEGAKDMWDKLRKES KNLN IQGSLFELCGSGNGAAGS 
LLPAFP VLLVS LS AALATWLS F 


6566 


3 


1385 


KYESAQPGGl^PEPGI^ARMAIHKALVMCLGLPLFLFPGAWA^G 
HVPPGCSO^LNPLYYNLCDRSGAWGIVJUEAVAGAGIVTTFVLTI 
I LVASLP FVQDTKKRSLLGTQVFFLLGTLGLFCLVFACVE KPDF 
STCASRRFLFGVLFAICFSCLAAHVFALNFLARKNHGPRGV7VIF 
TVAUjLTLVE VI INTEWLI I TLVRGSGEGGPQGNSSAGWAVAS P 
CAIANWDFVMAL 1 Y VMLLLLGAFIiGAWPALCGR YKR WRKHG VFV 
LLTTATS VAI WWW I VM YTYGNKQHNS P TWDDPTIiAI ALAANAW 
AFVLFYVIPEVSQVTKSSPEQSYQGDMYPTRGVGYETILKEQKG 
QSMFVENKAFSMDEPVAAKRPVSPYSGYNGQLLTSVYQPTBMAL 








MHKVPSEGAYDIILPRATANSQVMGSANSTLRAEDMYSAQSHQA 
AT P P KDGKNS QVFRNP YVWD 


6567 


125 


863 


TKRSNLKAYACS IHH IRTMS YVFVNDSSQTNVPLLQACI DGDFN 
YS KR LLESG FDPNI RDS RGRTGLHLAAARGNVD I CQLLH KFGAD 
LLATDYQGNTALHLCGHVDTIQPL VSNGLKID I CNHQGATPLVL 
AKRRGVNKDVI RLLESLEEQEVKG FNRGTHSKLETMQTAESBSA 
MBSHSLLNPNLQQGEGVLSSFRTTWQEFVEDLGFWRVLLLIEVI 
ALLSLGIAYYVSGVLPFVENQPELVH ! 


6568 


3 


1183 


HASDRLLVLPDNYSHFSQASANLQGPSRTTBLFHPTLASISSPM 
LEGAELYFNVDHGYIiEGLVRGCKASLLTQQDYINLVQCETLEDL 
KIHLQTTBYGNFLANHTNPLTVSKIDTEMRKRLCGEFEYFRNHS 
LE PLSTFLTYMTCS YMI DNVI LLMNGALQKKS VKEILG KCHPLG 
RFTEMEAVNIAETPSDLFNAILIETPLAPFFQDCMSEKALDELN 
IELLRNKLYKSYLEAFYKFCKNHGDVTAEVMCPILEFEADRRAF 
I ITLNSFGTELSKSDRBTLY PTFGKLY PEGLRLLAQAEDFDQMK 
NVADHYGVYKPLFEAVGGSGGKTLEDVFYEREVQMNVLAFNRQF 
HYGVFYAYVKLKEQEIRNIVWIAECISQRHRTKIN3YIPIL 


6569 


205 


1532 


RRRGP QRLGHGRPT PLL CRWRTAG PS H WE KQARAFQGLRP VD PR 
RMSWLFPLTKSASSS AAGS PGGLTSLQQQKQRLI ESLRNSHSS1 
AEIQKDVEYRLPFTINNLTININILLPPQFPQEKPVISVYPPIR 
HHLMDKOGVYVTSPLVNNFTMHSDLGKIIQSLLDEFWKNPPVLA 
PTSTAFPYLYSNPSGMSPYASQGFPFLPPYPPQEANRSITSLSV 
ADTVSSSTTSHTTAKPAAPSFG VLSNLPLPIPTVDAS I PTSQNG 
FGYKMPDVPDAFPELSELSVSQLTDMNEQEEVLLEQFLTLPQLK 
QI ITDKDDLVKS IEELARKNLLLEPSLE AKRQTVLDKYELLTQM 
KSTFEKKMQRQHELSESCSASALQARLKVAAHEAEEESDNIAED 
FLEG KME I DDFLS S FMEKRTI CHCR RAKEE KLQQ AI AMHSQ FHA 
PL 


6570 


330 


1304 


ARLPRLTFLREGFLYVIiLSHWVFVGAPRPPASDSWKKGLVPSAP 
PASRKMGS KALPAP I PLHPS LQLTNYS FLQAVNTFPATVDHLQG 
LYGLSAVQTMHMmiWTLGYPimiEITRSTITEMAAAQGT,VDAR? 
PFPALPFTTHLFHPKQGAIAHVIjPAIiH KDRPRFD FANLAVAATQ 
EDPP KMGDLSKLS PGLGS P I SGLSKLTPDRKPSRGRLPSKTKKE 
FICKFCGRHFTKSYNLLIHERTHTDERPYTCDICHKAFRRQDHL 
RDHRYIHSKEKPFKCQECGKGFCQSRTIAVHKTLHMQTSSPTAA 
SSAAKCSGETVICGGT 


6571 


169 


656 


APDMKRKKI^KLTDTIiTKNCKHLFRGFDKDNDGCVNVLEWIHGL 
SliFIiRGSIiSEKMKYCFEVFDLNGDGFISKEEMFHMLKNSLLKQP 
SEEDPDEG I KDLVE ITLKKMDHDHDGKLSFADYELAVREETLLL 
E AFGPCL PDPKSQME FEAQVFKDPNE FNDM 


6572 


49 




T P ERAQ PG ALLGAAG CG VCGGRW WPRS HERG YFS S AKMGSKRRN 
LSCSERHQKLVDBNYCKKLHVQALKNVNSQIRNQMVQN3NDNRV 
QRKQFLRLLQNEQFELDMEEAIQKAEENKRLKELQLKQEEKLAM 
ELAKLKHESLKDEKMRQQVRENSIELRELEKKLKAAYMNKERAA 
QrAEKDAIKYEQMKRDAEIAKTMMEEHKRIIKEENAAEDKRNKA 
KAQ YYLDL EKOLE EOE KJCKOEA YE OLL KE ITT.M TTi Tf T VP V T VP i?n 

QLEKQQKLEKMNAMRRYIEEFQKEQALWRKKKREEMEEENRKII 
EFANMQQQREEDRr4AKVQENEEKRLQLQNALTQKI>EEMLRQRED 
LEQVRQELYQEEQAEIYKSKLKEEAEKKLRKQKEMKQDFEEQMA 
liKBLVLQAAKEEEENFRKTMLAKFAEDDR I ELMNAQKQRMKQLE 
HRRAVEKLI EERRQQFLADKQRELE EWQLQQRRQGFI NA I IEEE 
RLKIjLKEHATNLLGYLPKGVFKKEDDIDLLGEEFRKVYQQRSEI 
CEEK 


6573 


767 


275 


GGGGGESQSFRAQDGTRTPATDCLMYIiCXSPRlOiMTQGGYDMVQK' 
LFLDFFRRRLSQRP TAEE LEQRNI LKPRNEQE EQEEKRE I KRRL 
TRKLSQRPTVEELRERKrLIRFSDYVEVADAQDYDRRADKPWTR 
SEQ ID NO:

Predicted beginning nucleotide location
corresponding to first amino acid residue
of amino acid sequence

Predicted end nucleotide location
corresponding to first amino acid residue
of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
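The legend in the column heading above is in effect a lookup table from one-character symbols to residue names and annotations. As a minimal sketch only, assuming Python and using hypothetical names (`LEGEND`, `describe`, and `unrecognized` do not come from this document), the legend could be encoded and used to flag characters in a transcribed segment that fall outside it:

```python
# Symbols exactly as listed in the table legend; the names LEGEND,
# describe and unrecognized are illustrative assumptions.
LEGEND = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid",
    "E": "Glutamic Acid", "F": "Phenylalanine", "G": "Glycine",
    "H": "Histidine", "I": "Isoleucine", "K": "Lysine",
    "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine",
    "S": "Serine", "T": "Threonine", "V": "Valine",
    "W": "Tryptophan", "Y": "Tyrosine", "X": "Unknown",
    "*": "Stop Codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def describe(segment):
    """Return the legend entry for each symbol, skipping whitespace.

    Symbols that are not in the legend are reported as None.
    """
    return [LEGEND.get(ch) for ch in segment if not ch.isspace()]


def unrecognized(segment):
    """Return the sorted, de-duplicated symbols that the legend does not define."""
    return sorted({ch for ch in segment if not ch.isspace() and ch not in LEGEND})
```

For example, `describe("MK*")` yields the names for Methionine, Lysine, and the stop codon, while `unrecognized` can be used to list characters in a row of the table that the legend does not cover (such as lower-case letters or digits).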


6574 


204 


1159 


LTAADKV S RGEC WRVGGRTVCW VS IjGS PLGS V 

LESSVPVSVGVFWACGVSWTGAAGU3DGALSDTMARNAEKAMTA 

LARPRQAQLEEGKVJCERRPFIiASECTBLPKAEKWRRQI IGSIS K 

KVAQIQNAGLGEPRIRDLNDEINKLLREKGHWEVRIKELGGPDY 

GKVGPKMLDHEGKEVPGNRGYKYFGAAKDLPGVRELFEKEPLPP 

PRKTRAELMKAIDFEYYGYLDEDDGVIVPLEQEYEKKLRAELVE 

KWKAERE ARLARGE KE EEEEE EEE INI YAVTEEES DE EGSQEKG 

GDDSQQKFIAHVPVPSQQEIEEALVRRKKMELLQKYASETLQAQ 
SEEARRLLGY 


6575 


117 


820


spaiasqsggiteekmlepqengvidLpdyehvedetfppfppp 

ASP ERQDGEGTE PDEE SGNGAPVPVPPKRTVKRN 1 P KLDAQRL 1 

SERGLPALRHVFDKAKFXGKGHEAEDLKMLIRHMEHWAHRLFPK 

LQFEDFIDRVEYLGSKKEVQTCLKRIRLDLPILHEDFVSNNDEV 

AENNEHDVTSTELDPFLTNLSESEMFASELSISLTEEQQQRIER 
NKQLALERRQAKLP 


6576 
6577 


1 


1060 


PEPQALVGQKKGALRLLVARLVLTVSAPAEVRRRVLRPVtSWKjD 
RETRALADSHFRGLGVDVPGVGQAPGRVAFVSEPGAFSYADFVR 
GFLLPNLPCVFSSAFTQGMGSRRRWVTPAGRPDFDHLLRTYGDV 
WPVANCGVQEYNSNPKEHMTLRDYITYHKEYIQAGYSSPRGCL 
YL KDWHLCRD FP VED VFTL PVYFS S DWLNE FWDALDVDD YR FVY 
AG PAGSWS P FHADIFRS FS WSVNVCGRKKWLLFPPGOEEALRDR 
HGNLPYDVTSPALCDTHLHPRNQLAGPPLEITQEAGEMVFVPSG 

WHHQVHNLVMCCFSCPLSGAFLQEDGSTTSPLSQPELGWNGVAH 
G 


6578 


2271 


987 


^DRMASDti^DlVIEAMLEAPYKJkEEDEQQRKEVKKDYPSNTTSS 
TSNS GNETSGS S TIGBTSNRSRDRDRY RRRNS RSRSPGRQ CRHR 

SRSWDRRHGSESRSRDHRREDRVHYRSPPLATGYRYGHSKSPHF 
REKS P VRE P VDNLS P EERDART VFCMQLAARI RPRDLEDF FS AV 
GKVRDVRIISDRNSRRSKGIAYVEFCEIQSVPLAIGLTGQRLLG 
VPI I VQASOAEKNRIjAAWAlflNLQXGNGGPMRL YVGSLHFNI TED 
MLRGIFEPFGKIDNIVLMKDSDTGRSKGYGFITFSDSECARRAL 
EQLNGFELAGRPMRVGHVTERLDGGTDITFPDGDQELDLGSAGG 
RFQLMAKLAEGAGIQLPSTAAAAAAAAAAQAAALQLNGAVPLGA 
LNPAALTALSPALNLASQCLQLSSLFTPQTM 




377 


1489 


PSSSATMNI^PLKRATIIjHMALIXSASDPSAEAEANGEKPFLLRA 
LQIALWSLYWVTSISMVFLNKYLLDSPSLRLDTPIFVTFYQCL 
VTTLLCKGLSALAACCPGAVDFPSLRLDLRVARSVLPLSWFIG 
MITFNNLCLKYVGVAFYNVGRSLTTVFNVLLSYLLLKQTTSFYA 
LLTCGI I IGGFWLG VDQEGAEGTLS WLGT VFGyLASLCVS LNAI 
YTTKVLPAVDGSIMRLTFYNNVNACILFLPLLLLLGELQALRDF 
AQLG S AH FWGMMTLGGLFG FAI GYVTGLQ IKFTS PLTHNVSGTA 

KACAQTVLAVLYYEETKSFLWWTSNMMVIiGGSSAYrWVRGWEMK 
KTPEEPSPKDSEKSAMGV 


6579
6580


2 


711 


RPPRVWYPELRKLSAAAPRWSHRTAPGIMVFYFTSSSVNSSAYf 
IYMGKDKYENEDLIKHGWPEDIWFHVDKLSSAHVYLRLHKGENI 
EDIPKEVIJ4DCAHLVKANSIQ<3CKMNNVN^^ 
DVGQIGFHRQKDVXIVTVEKKVNEILNRLEICTKVPR ppnT ant?*- 

ECRDREERNEKKAQIQEMKKREKEEMKKKREMDELRSYSSLMKV 
ENMS S NQDGNDS DEFM 




62 


1571 


LVALKNWKP KUTN I PAPQSP VFGEAV5G VYMMTKYLGMAPVLGP 
RPPQEQVGPLMVKVEEKEEKGKYLPSLEMFRQRFRQFGYHDTPG 
PREALSQLRVLCCEWLRPEIHTKEQILELUVLEQFLTILPQBLQ 
AWVQEHCPESAEEAVTLLEDLERELDEPGHQVSTPPNEQKPVWE 
KrSSSGTAKESPSSMQPQPLETSHKYESWGPLYIQESGEEQEFA 
QDPRKVRDCRLSTQHEESADEQKGSEAEGLKGDI ISVI I ANKPE 
^l^RQCVNLEiraKGTKPPIjQEAGSKKGRESVPTKPTPGERRYl 








CAECGKAFSNS SNLT KHRRTHTGEKP YVCTKCGKAFSHSSNLTI* 
HYRTHLVDRPyDCKCGKAFGQSSDLLKHQRMHTEEAPYQCKDCG 

K AF*J f? If n QT.TDHVPT UTV^L" VT> VAr»HU W VP Ttin AT I > t n mmn-w mw 

ivrtr oio avjo jjj. km ikih Ivjr.Ki' i OCWECGJCSFSQHAGLSSHQRLH 
TGEKP YKCKECGKAFNHS SNFNKHKR I HTGEKPY WCHHCGKTFC 
SKSNLSKHQRVHTGEGEAP 


6581


228 


476 


RVFLKDLSSTPMASNNTASIAQARKLVEQLKMEANIDRIKVSKA 

& AOT.M&V^Tm* &U&VI7T^OT.r 1»nTm» PPunnnpirimnn* w 


6582


1428 


718 


CFTTKTHCS PVSVP YhS PLVLRKELES LLENEGDQ VI HTSS F I N 
QHPIIFWTLVWYFRRLDLPSNLPGLILTSEHCNEGVQLPLSSLS 
QDS KLVY I QLLWDN INLHQEPPJB PLYVS WRNFNSE KKSS LLS EE 
QQETSTLVET I RQS IQHNNVLKP INLLS QQMKPGMKRQRS LYRE 
ILFLSLVSLGRENIDIEAFDNEYGIAYNSLSSEILERLQKIDAP 
PSASVEWCRKCFGAPLI 


6583


487 


41 


RI FSMTSGRLRWRCTWRPATALWSASLRLGTSSMHPS PRSISLP 
LSMMLSPLPSNTRGLS PTALFRS PDSEHATSCPRLHLWRCRAPL 

RSPSPIX3RLQVLPRSPLHVHTHNSGKEVLGLQVQRSRSGTGPAC 
SQAGSGAVQGGNWC I F 


6584 


189 


1750 


PLPMAAIX3PSSQNVTEYVVRVPKNTTKKYNIMAFNAADKVNFAT 
WNQARLERDLSNKKIYQEEEMPESGAGSEFNRKLREEARRKKYG 
IVLKEFRPEDQPWLLRVNGKSGRKFKGIKKGGVTENTSYYIFTQ 
CPTOAFEAFPVHNWYNFTPLARHRTLTAEEAEEEWERRNKVI^H 
FS I MQQRR LKDQDQDEDEE EKE KRGRRKAS ELR IHDLEDDLEMS 
SDASDASGEEGGRVPKAKKKAPIAXGGRKKKKKKGSDDEAFEDS 
DDGDFEGQEVDYMSDGSSSSQEEPESKAKAPQQEE3PKGVDEQS 
DSSEESEEEKPPEEDKEEEEEKKAPTPQBKKRRKDSSEESDSSE 
ESDIDSEASSAFFMAKKKTPPKRERKPSGCSSRGNSRPGTPSAE 
GGSTSSTLRAAASKLEQGKRVSBMPAAKRLRIiDTGPQSLSGKST 
PQPPSGBCTTPNSGDVQVTEDAVRRYLTRKPMTTKDLLKKFQTKK 
TGLSSEO/TVNVIiAQILKRIjNPERKMINDKMHFSLKE 


6585




1678


GPI RNSR IDDF VGGDPRAEASCS VLHS KPHAMADSRDP ASDQMQ 
HWKEQRAAQKADVLTTGAGNP VGDKLNVITVGPRG PLLVQDWF 
TDEMAHFDRBRIPERVVHAKGAGAFGYFEVTHDITKYSKAKVFE 
HIGKKTPIAVRFSTVAGESGSADTVRDPRGFAVKFYTEDGNWDL 
VGNNTP IFFIRDPILFPSFI HSQKRNPQTHliKD PDM VWDFWS LR 
P ESLHQ VS FLFSDRG I PDGHRHMNGYG3HTFKL VNANGEAVYCK 
FHYKTDQGIKNLSVEDAARLSQEDPDYGIRDLFNAIATGKYPSW 
TFYIQVMTFKQAETFPFNPFDIjTKVWPHKDYPLIPVGKLVLNRN 
P VNYFAEVEQ IAFDPSNMPPG IEAS PDKMLQGRLFAYPDTHRHR 
LGPNYLHIPVNCPYRARVANYQRDGPMCMQDNQGGAPMYYPNSF 
GAPEQQPSALEHS I QYSGE VRRFNTANDDNVTQ VRAF YVNVLNE 
EQRKRLCENIAGHLFCDAQIFIQKKAVKNFTEVHPDYGSHIQALL 
DKYNAEKPKNAI HTFVQSGSHIAAREKANL 


6586


32 


804 


PLPEQPA2STSTMPVSGTPAPNKKRKSSKLIMELTGGGQESSGL 
N LGKKT S VP RD VMLEELS LLTNRGS KMFKLRQMR VE KFI YENHP 
DVFSDS SMDHFQK FL PTVGGQLGTAG QG FS YS KSNGRGGSQAGG 
S GSAGQ YGS DQQHHLGSG S G AGGTGG PAGQAGRGGAAGTAG VGE 
TGSGDQ AGGEG KH I TVFKT Y I S PNERAMG VDPQQKMELG I DLLA 
YGAKAELP KYKS FNRTAMP YGGYEKASKRMTFQMPKV 




75 


1117 


RRVPSLGKMPECWDGEHDIETPYGLIjHWIRGSPKGNRPAILTY' 
HDVGLNHKLCFNTF FNFEDMQE ITKHF VVCHVDAPGQQVGASQF 
PQGYQFPSMEQLAAMLPSVVQHFGFKYVIGIGVGAGAYVLAKFA 
LI FPDLVEG LVLVN I DPNGKGW I DWAATKLSGLTS TLPDTVLS H 
LFSQEELVNNTELVQSYRQQIGNWNQANLQLFWNMYNSRRDLD 
INRPGTVPNAKTLRCPVMLWGDNAPAEDGWECNS kldptttt 
FLro^SGGI^QVTQPGKLTEAFKYFLQGMGYMPSASMTRLARS 
RTAS LTS AS S VDGSR PQACTH3E3SEG LGQVNHTME VS C 


6588


137


501 


LGLQAQLLEliRTNNYQLSDELRKNGVELTSLRQKVAYLDKEFSK 
AQKAI^KSKKAQEVEVLLSBNEMLOAKLHSQEJEDFRLQNSTLMA 
E FSKLCSQMEQLEQENQQLKEGAAGAGVAQAGP 


6589 


2 


1405 


R P WGS AMAT FS ROE FFOOLLOGCLLPTACkSpt hatmt i V is.mr a — 
CRLLWRLGLPSYLKHASTVAGGFFSLYHPFQLHMVV3VVLLSLLC 

ylvlflcrhsshrgvflsvtiliyllmgemhmvdtvt;wkmrga 
qm i vamkavs lgfdldrge vgtvpsp ve fmg yl yf vgti vfgp w 
isfhsyi^avox^plscrwlq^arslalallclvlstcvgpyl 
fpyfiplngdrllrnkkrkargtmvrwlrayesavsfhksnyfv 
gflseatatlagagfteekdhlewdltvskplnvblprsmvevv 
tswnlpmsywlnnyvfknalrlgtfsavlvtyaasallhgfsfh 
laavllslafityvehvlrkrlarilsacvlskrcppdcshqhr 
lglgvrai*nllft3alai fhlaylgslfdvdvddtteeqgygmay 
tvhkwselswashwvtfgcwi fyrlig 


6590 


2177 


656 


VRAYSHVLSLIiENVFTPMFCHRDEYFRQLIAGAESPTRNSKLNR 
vjo no uuu r iu?j x si KKbASs r t»l 5 R IG SKI KG VFKSTTMEGAMLPN Y 

gvaegeddfieegiwmeddspveavstpntprnlaawkisipy 
vdffedpsserxekkeripvfcidverndrravghepehwsvyr 

R YLEF YVLESKLTE FHGAFPDAQLPS KRI IGPKNYEFLKS KREE 
FQ E YLQKLLQHPELSNS QLLAD FLS PNGGETQ FLDK I LP DVNLG 
KIIKSVPGKLMKEKGQHLEPFIMNFINSCESPKPKPSRPELTIL 
SPTSENNKKLFOTDLFKNNANRAENTERKQNQNYFMEVMTVEGVY 
DYLMYVGRWFQVPDWLHHLIiMGTRILFK^^^LEMYTDYYXOCKIJ 
EQLFQEHRLVSLITLLRBAIFCENTEPRSLQDKQKGAKQTFEEM 
MNY I PDLLVKCI GEETKYE S IRLLFDGLQQP VLNKQLTYVLLD I 
VI QELFPE LNKVQKE VTS VTS WM 


6591


2177 


654 


VRAYE HVLS Ltlj ENVFT PM FCHRDE Y FRQLlRGAES PTRNS KLNR 
oouiiiuut 1 kh iUKKQ»l5SrGISRIGSKIKGVFKSTTMEGAMLPNY 
G VAEGEDDF I EEG I WMEDDS PVEAVST PNTPRNLAAWK I S I P Y 
VD FFE DPS SER KEKKERI P VFC I D VERNDRRAVGHEP EHWS VYR 
RYLEFYVZjESKLTEFHGAFPDAQLPS KRI IGPKNYEFLKS KREE 
FQEYLQKLLQHPBLSNSQLLADFLSPNGGETQFLDKILPDVNLG 
KI IKS VPGKLMKEKGQHLEPFIMNFINSCBS PKPKPSRPELTIL 
S P TS ENNK KL FNDLFKNKANRABNTERKQNQN y fme vmtveg vy 
D YLM YVGRWFQ VP DWLHHLLMGTR I LFKNTLEMYTD Y YLQC KL 
EQLFQEHRLVSLITLLRDAIFCENTEPRSLQDKQKGAKQTFEEM 
MNY I PDLLVKC IGEETK YE S I RLL FDGLQQP VLNKQLT YVLLDI 
VIQEL FPELNKVQKEVTSVTSWM 


6592 


3 


1861 


APEFLGSTISSGSMIDANLKLLQEAEQRLKAIVAEKFAIATKEG
DLPQVERFFKlFPLLGLHEEGIiRKFSEYLCKQVASKAEENLLMV 
LGTDMSDRRAAVI FADTLTLLFEGIARI VETHQP IVETYYGPGR 
LYTLIKYLQVECDRQVEKVVDKFIKQRDYHQQFRHVQNNLMRNS 

MASEEVKQEHQKCLDKLLNNCLLSCTMQELIGLYVTMEEYFMRE 
TVNKAVALDTYEKGQLTS s m VDD VF Y i vkxc igrals ss s I dcl 

CAM INIATTELESDFRD VLCNKLRMGF PATTFQD I QRGVTS AVN 

IMHSSLQQGKFDTKGIESTDEJUOVISFLVTLNNVEVCSENISTLK 

KTLESDCTKLFSQGIGGEQAQAKFDSCLSDLAAVSNKFRDLLQE 

GLTELNSTAIKPQVQPWINSFFSVSHNIEEEEFNDYEANDPWVQ 

QFILNLEQQMAEFKASLSPVIYDSLTGLMTSLVAVELEKWLKS 

TFNRLGGI^FDKELRSLIAYLTTVTTWTIRDKFARLSQMATILN 

LERVTEILDYWGPNSGPLTWRLTPAEVRQVLALRIDFRSEDIKR 
LRL 


6593 


3 


1837


EAFSAGSRRRGLALQRGVLGGLGGYCPCCCRRRGRLLVLLLLVR

RGGEGGGGRGRGDKRRRRQARRQRRRPEPAEARGGKMADVLSVL 

RQYNIQKKEIWKGDEVIFGEFSWPKNVKTMYVVWGTGKEGCPR 








EY YTLDS ILFLLNNVHLSHPVYVRRAATENI PVVRRPDRKDLLG 
YLNGE ASTS AS I DRSAPLEIGLQRSTQVKRAADEVLAEAKKPR I 
EDEECVRLDKERLAARLEGHKEGIVQTEQIRSLSEAMSVEKIAA 
I KAK I MAKKRS T I KTDLDDD I TALKQRS FVDAE VD VTRD I VS RE 
RVWRTRTT I LQSTGKN FSKN I FAI LQS VKAREEGRAPEQRPAPN 
AAPVDPTLRTKQPIPAAYNRYDQERFKGKEETEGFKIDTMGTYH 
GMTLKS VTEGASARKTQTPAAQP VPRP VS QARP P PNQKKGSRTP 
1 1 I IPAATTSLITMLNAKDLLQDLKWPSDEKKKQGCQRENETL 
IQRRKDQMQPGGTAISVTVPYRWDQPLKLMPQDWDRWAVFVQ 
G P AWQFKG WP W LL P DGS P VD I FAKI KA FHLKYDE VRLD PNVQ KW 
DVTVLELS YHKRHLDRP VFLR VWE TLDR YM VKHKSHLR F 


6594 


1 


1096 


EFPGRRFRGSQASPLCATOGPALLRAPTRAAMTRSLFKGNFWSA 
D I LS TIG YDNI IQHLNNGRKNCKEFEDFLKERAAIEER YGKDLL 
NLSRKKPCGQSEINTLKRALBVFKQQVDNVAQCHIQLAQSLREE 
ARKMEEFREKQKLQRKKTELIMDAIHKQKSLQFKKTMDAKKNYE 
Q KCRDKD EAE QAVS RSANLVNPKQQ E KLFVKLATS KTAVEDSD K 
AYMLHIGTLDKVREE WQ S EH I KACEAFEAQE CBR INF FRNAL W L 
HVNQLSQQCVrSDEMYEQVRKSLEMCSIQRDIEYFVNQRKTGQI 
P PAP I MYENFYSSQKNAVPAGKATG PNLARRGPLP I P KS S PDDP 
NYSLVDDYSLLYQ 


6595 


57 


781 


PLGTMSDSDLGEDEGLLSLAGKRKRRGNLPKESVKILRDWLYLH 
RYNAYPSEQEKLSLSGQTNLSVLQICNWFlNARRRLLPDMLRKD 
GKDPNQFTI SRRGGKASDVALPRGSS PSVLAVSVPAPTNVLSLS 
VCSMPLHSGQGEKPAAPFPRGEIiESPKPLVTPGSTLTLLTRAEA 
GSPTGGLFNTPPPTPPEQDKEDFSSFQLLVEVALQRAAEMBLQK 
QQ DP S LPLLHTP I P LVS ENP Q 


6596


2 


1026 


PRbPVRRYHGRRRUJGRSRGHMAEGDAGSDQRQNEEIEAMAAIY" 

geewcviddcakifcirisddiddpkwtlclqvmlpneypgtap 
p i yqlnapwlkgqeradlsnslee i yi qniges ilylwveki rd 
vl i q ks qmt e pg pd vkkkte eed ve c e ddl i lacq pes s vkald 
fd i setrteveveelpp idhgi p i tdrrstfqahlap wcpkqv 

KMVLSKLYENKKIASATHNI YAYR I YCEDKQTFLQDCEDDGETA 
AGGRLLHLWElLNVKt^VWSRWYGGII^PDRFKHINNCARN 
I L VE KNYTNS P EES SKALGKN FCK VRKDKKRNEH 


6597 


2 


1026 


PRLPVRRYHGRRRLQGRSRGHMAEGDAGSDQRQNEEIEAMAAIY 
GEEWCVIDDCAKIFCIRISDDIDDPKWTLCLQVMLPNEYPGTAP 
PIYQLNAPWLKGQERADLSNSLEEIYIQNIGESILYUJVEKIRD 
VXjlQK.SQMTbPGPDVKKKTEEEDVECEDDLILACQPESS VKALD 
FDISETRTEVEVEELPPIDHGIPITDRRSTFQAHLAPWCPKQV 
KMVLSKLYENKKIASATHNIYAYRI YCEDKQTFLQDCEDDGETA 
AGGRLLHLME ILNVKNVMVVVSRWYGGI LLGPDRFKHINNCARN 
ILVEKNYTNSPEESSKALGKNKKVRKDKKRNEH 


6598 


1099 


419 


P R VR WATTMAMS FEW P WQYRF P PFFTLQPNVDTRQ KQLAAWCS L 
VLSFCRLHKQSSMTVMEAQESPLFNNVKLQRKLPVESIQIVLEE 
LRKKGNLEWLDKSKSSFLIMWRRPEEWGKLIYQWVSRSGQNNSV 
F TL YE LTNGEDTEDEEFHGLD EATLLRALQALQQEH KAE I ITVS 
DGPRRQ VLLAGTCLPLLLTS H LS RAFKRRQTQC P P KTGSVTPPD 
S KG LQS 


6599 


164 


1593 


KMAALTTLFKYIDENQDRYIKKLAKWVAIQSVSAWPEKRGEIRR
MM EVAAAD VKQLGGS VELVD I GKQKLPDGSEI PL P P I LLGRLGS 
DPQKKTVCI YGHLDVQPAALEDGWDSEP FTLVERDGKLHGRGST 
DDKG P VAG W I NALEAYQKTGQE I P VNVRFCLEGMEESGS EGLDE 
LIFARKDTFFKDVDYVCISDNYWLGKKKPCirYGLRGICYFFIE 
VECSNKDLHSGVYGGSVHEAMTDLILLMGSLVDKRGNILIPGIN 
E AVAAVTF. EEHKL YDD IDFDIEEFAKD VGAQ I U»H S H KKD ILMH 
R WR YP SLSLHG I EG AFSGSGAKTVI PR KWGKFS I RLVPNMTP E 








v vubw v x a i ill KAriUaJjKSFNlirKVYMGHGGKPW VSDFSHPH YJj 
AGRRAMKTVFGVEPDLTREGGS I P VTLTFQEATGKNVMLLP VGS 
ADDG AHS QNE KLNRYNY I EGTKMLAAYL YB VSQLKD 


6600 


2 


934 


PGRLFRVAAMESAGLEQLLRELLLPDTERIRRATEQLQIVLRAP 
AALSALCDLLASAADPQ I RQPAAVLTRRRLNTRWRRLAAEQRES 
L KS L ILTALQRETEHCV S L SLAQLS AT I PRKEGL EAW PQLLQLL 
QHS THSPH8 PE REMGL LLLS WVTSR PEAFQ PHHRELLRL IiNET 
LG E VGS PGLL FYS LRTLTTMAP YLSTEEVPLARMLVPKL IMAMQ 
TLIPIDEAKACSALEALDELLESEVPVITPYLSEVLTFCLEVAR 
NVALGNAIR I R I LCCLTFLVKVKS KALLKNR LLATLAAHPFPHC 
GC 


6601 




TTTn 

1420 


PRAAARAPPPAVLRRDRRAATAPGAGEMTLHGPLAQRYFLNHIE 
KI TTWQDP RKAMNQP LNHMNLHP AVSS TP VPQRSMAVS Q PNLVM 
NHQHQQQMAPSTLSQQNHPTQNPPAGLMSKPNALTTQQQQQQKL 
R LQR I QMERER I RMRQEELMRQEAALCRQL PMEAETIiAPVQAAV 
NP PTMTPDMRS I TNNSSDP FLNGGP YHSREQSTDSGLGLGC YS V 
PTTPEDFL5NVDEMDTGENAGQTPMNINPQQTRFPDFLDCLPGT 
NVDLGTLESEDLI PLFNDVESALNXSE PFLTWL 


6602 


127 


617


LLD FPALPKFVLAQS PKAGKPSTMTSMTQSLRE VI KAKTKARNF 
ER VLGKITLVS AAPGKV ICEMKVE BEHTNA I GTLHGGLTATLVD 
NI S TMALLCTERGAPG VS VDMNI T YMS PAKLGED I VITAHVLKQ 
GKTIAFTSVDLTNKATGKLIAQGRHTKHLGN 


6603 


/ y 


660 


PVGPSSLAARTGLGHLPFLHRLASSRGLDMDLLQFLAFIjFVLLL 
SGMGATGTLRTSLDPSLE I YKKMFE VKRREQLLALXNLAQLNDI 
HQQYKILDVMLKGLFKVLEDSRTVLTAADVLPDGPFPQDEXLKD 
AFSHWENTAFFGDWLRFPRIVHYYFDHNSNWNLLIRWGISFC 
NQTGVFNQGPHS P ILSLM 


6604


3 


688 


TSTAQRQGGERMS FRGGGRGGFNRGGGGGGFNRGGS 5NHFRGGG 
GGGGGGNFRGG GRGGFGRGGG RGG FNKGQDQGP P ERWLLGEFL 
HPCEDDIVCKCTTDENKVPYFNAPVYLENKEQIGKVDEIFGQLR 
DFYFSVKLSENMKASSFKKLQKFYIDPYKLLPLQRFLPRPPGEK 
GPPRGGGRGGRGGGRGGGGRGGGRGGGFRGGRGGGGGGFRGGRG 
GGFRGRGH 


6605 


7 


848 


SGSRRGAMRAAGVGLVDCHCHLSAPDFDROLDDVIjEKAKKANVV 
ALVAVAEHSGEFEKI MQLSERYNGFVIiP CLG VHP VQGLPPEDQR 
SVTLKDLDVALP I IENYKDRLLAIGEVGLDFSPRFAGTGBQKEE 
QRQVLIRQIQIiAKRLNLPVNVHSRSAGRPriNLLQEQGAEKVLL 
HAFDGRPSVAMEGVRAGYFFS IPPSI 1RSGQQKLVKQLPLTSIC 
LETDS PALGPEKQVRNEPWNI S ISAB YI AQVKGI SVEE VIE VTT 
QNALKL F P KLRHLLQ K 


6606 


2 


1682 


FVEIRPRAEVANLSAHSASPIQDAVLKRLSLLEDIVYRQLNGLS
KSLGLIEGYGGRGKGGLPATXjS paee EKAKGPHEKYGYNS YLSE 
K I SLDRS I P DYRPTKCKE LKYS KDLPQI S 1 1 F I FVNEALS VI LR 
SVHSAVNHTPTHLLKEIILVDDNSDEEELKVPLEEYVHKRYPGL 
VKWRNQKREGLIRARIEGWKVATGQVTGFFDAHVEFTAGWAEP 
VLSR I Q EN RKRVI L PS I DN I KODNFEVORYENSAHG YS V7PT.WPM 
YISPPKDWMDAGDPSLPIRTPAMIGCSFWNRKFFGEIGLLDPG 
MDVYGGENIELGIKVWLCX3GSMEVLPCSRVAHIBRKKKPYNSNI 
GFYTKRNAIjRVAEVWMDDYKSHVYIAWNLPLENPGIDIGDVSER 
RALRKSLKCKNFQWYIjDHVYPEMRRYNNTVAYGELRNNKAKDVC 
LDQGPLENHTAILYPCHGWGPQIJUIYTKEGFLHLGAI,GTTTIiLP 
DTRCI/VDNS KSRLPQLLDCDKVKSS LYKRWNFIQNGAIMNKGTG 
RCLEVENRGLAGIDLI LRSCTGQRWTI KNS I K 


6607 


137 


98£ 


VPACAGLKJCEARSLIASPPRLLNTKLQASCRALFSPPIQSRQTT 
GISFGGRGGAGPGVPTRTQVPAAMGAVMGTFSSLQTKQRRPSKD 

kiedelemtmvchrpegleqleaqtnftkrelqvlyrgfknecp 








" u v »n«uit rv v -L Iriyc r rnuUrto 1 I Aril J-irls Ar UL VKrfi 

DPVTALS I LLRGTVHE KLRWTFNLYD INKDG Y INQEEMMD I VKA 
IYDMT4GKYTYPVLKEDTPRQHVDVPFQKMDKNKDGIVTLDEFLE 
SCQEDDNIMRSLQLFQNVM 


6608 


224 


1140 


rurur oairxu Jjv_fkj_>£> XiTljLIjI^jiAYJjrrrKQPSPSPPMS VATRS 
TGTLQL PPQKP FGQE AS L PLAGBEE LS KGGEQDCALEELCKPLY 
CKLCNVTLNS AQQAQAHYQG KNKGKKLRNYYAANS CP PPARMS N 
WEPAATPWP VPPQMGS FKPGGRVI LATENDYCKLCDASFSS P 
AVAQAHYQGKNHAXRLRLAEAQSNS FS ESSELGQRRARKEGNEF 
win FWKttwm x 1 vywNMs i> x r N PKSRQR I PRDIiAMCVTFSGQF YC 
SMCNVGAGEEM E FRQHLE S KQHKS KVS EQR YRNEMENLG YV 


6609 


1 


443 


FRLRCRRFRVAGGRLAGAGLRESRVPAPEQRLSALTLLSWSAVT 
PAAEPGNFQLSPAEPRGPLaSPVRAAPRAPCPAAEMSELNTKTS 
P ATNQAAGQEE KGKAGNVKKAEE EE E I DI DLTAP ETEKAALAI Q 
GKFRRFQKRKKDPSS 


6610 


319 


881 


GRKSLCNLHIFIRFPLTYPDMYMGMMCTAKKCGIRFQPPAIILI 
YESE I KGKIRQR IMPVRNFS KFSDCTRAAEQLKNNPRHKS YLEQ 
VS LRQLE K L F S FLRG YLS GQSLAETMEQ I QRETT I DPEEDLNKL 
DDKELAKRKSIMDELFEKNQKKKDDPNFVYD1EVEFPQDDQLQS 
CGWDTESADEF 


6611 


978 


212 


PGCSGAGSRVWWLPALRHLAMGSTESSEGRRVSFGVDEEERVRV 
LQG VRLS ENWNRMKE PSS P PPAPTSSTFGbQDGNIiRAPHKEST 
LPRSGS SGGQQ PS GMKEGVKRYEQEHAA IQDKLFQVAKR ER EAA 
TKHSKASLPTGEGSISHBEQKSVRLARELESREAELRRRDTFYK 
EQLERIERKNAEMYKLSSEQFHEAASKMESTIKPRRVEPVCSGL 
QAQILHCYRDRPHEVLLCSDLVKAYQRCVSAAHKG 


6612 


1724 




VSTHASALSRTQGQPQRQPRAAASGAGAGTAGGGGSGGAEGS KM 
STEAQRVDDSPSTSGGSSDGDQRESVQQEPEREQVQPKKKEGKI 
SSKTAAKLSTSAKRIQKELAEITLDPPPNCSAGPXGDNIYEWRS 
TILG PPGS VYEGGVFFLD ITFS PDYPFKPPKVTFRTR I YHCNXN 
SQG V I CLDILKDNWSPALTI S XVLLS I CSLLTDCNPADPLVGS I 
ATQYMTNRAJE HDRMARQWTKRYAT 


6613


130 


748 


ELELSSNMPEQSNDYRVAVFGAGGVGKSSLVLRFVKGTFRESYI ' 
P TVEDT YRQ VI S CDKS ICTLQ I TDTTG SHQFP AMQRLS I S KGHA 
FILVYSITSRQSLEELKPIYEQICEIKGDV3SIPIMLVGNKCDE 

13 r v \joo OniSALiAK 1 vi Ai~ A trial o A KUjNHN VK£ LFQEXjLNIjE 

KRRTVSLQIDGKKSKQQKRKEKLKGKCVIM 


6614 


3 


1191 


S SAAEAMRVLVRRCWGPPLAHGARRGRPSPQWRAIJUUX3WEDCR 
DSRVREKPPWRVLFFGTDQFAREALRALHAARBNKEEELIDKLE 
WTMPS PS PKGL PVXQYAVQS QL PVY E W PDVGSGE YDVGWAS F 
GRLLNEALILKFPYGIWmiPSCXPRWRGPAPVIHTVLHGDTVT 
GVTIMQIRPKRFDVGPILKQETVPVPPKSTAKELEAVLSRLGAN 

i iux.w3 vijxuNijroaxjoWofCyyi'f'ir.ijAi. XAtflvJ.bAXjliulKWEEQTS 

EQIFRLYRAIGNI IPLQTLWMANTIKLLDLVEVNSSVLADPKLT 
GQALIPGSVIYHKQSQILLVYCKDGWIGVRSVMLKKSLTATDFY 
NGYLHPWYQKNSQAQPSQCRFQTLRLPTKKKQKKTVAMQQCIE 


6615 


832 


35 


GRVGAGASAMSELPGDVRAFLREHPSLRLQTDARKVRCILTGHE 
LPCRLPELQVYTRGKKYORLVRASPAFDYAEFEPHIVPSTKNPH 
QLFCKLTLRHINKCPEHVLRHTQGRRYQRALCKYEECQKQGVEY 
VPACLVHRRRRRBDQMDGDGPRPREAFWEPTSSDEGGAASDDSM 
TDLYPPELFTRKDLGSTEDGDGTDDFLTDKEDEKAKPPREKATD 
EGRRETTVYRGLVQKRGKKQIiGSLKKKPKSHHRKPKS FSSCKQS 
G 


6616 


347 


1886 


LLPPCQGARPLSSPPHASEDNLFLFWNCILCAFPHPSPQPIiQYP 
VWPLLLVITQI PAPRHLRNRPFS FSRGGLDSFSGSLSTPS ICRS 








PAWVKMAPWP PKGLV PAVLWGLSLFLNLPGPI WLQPSPPPQSS P 
PPQPHPCHTCRGLVDSFNKGLERTIRDN PGGGNTAWEEENLS KY 
KDSETRLVEVLEGVCSKSDFECHRLLELSEEtiVESWWFHKQQEA 
PDLFQWLCSDSLKLCCPAGTFGPSCLPCPGGTBRPCGGYGQCEG 
EGTRGGSGHCDCQAGYGGEACGQCGLGYFEAERNASHLVCSACF 
GPCARCSGPEESNCLQCKKGWALHffljKCVDIDECGTEGANCGAD 
QFCVNTEGSYECRJDCAKACLGCMGAGPGRCKKCSPGYQQVGSKC 
LDVDECETEVCPGENKQCENTEGGYRCICAEGYKQMEGICVKEQ 
IPESAGFFSEMTEDELWLQQMFFGI 1 1 CALATLAAKGDLVFTA 
IFIGAVAAMTGYWLSERSDRVLEGFIKGR 


6617 


118 


673 


VWMAWQVS LLELEDRLQCP ICLEVFKESLMLQCGHSYCKGCLVS 
LS YHLDTKVRCPMC WQ AVDGS S SL PNVS LAWVIEALRL PGD PEP 
KVCVHHRNPLSLFCE KDQEL I CGLCGLLGS HQHHPVTP I STVCS 
RM KE E LAALFSELKQ E Q K KVDE L IAKLVKNRTRT dc; q A p<? t ,r p r 
LGPATFTFL 


6618 


548


1U 


dgkvarrapnspafqndiyplvsaprAttaespwskvlqntqcr 

NVPKMTSBRSRIPCLSAAAAEGTGKKQQEGRAMATLDRKVPSPE 
AFLGKPWSSWIDAAKLHCSDNVDLEEAGKEGGKSREVMRLNKEA 
WKYGT 


6619 


246 


842 


PASSEVLTAAVMFLLLNCIVAVSQNMGIGKNGDLPRPPLRJNEFR 
YFQRMTTTSSVEGKQNLVIMGRKTWFSIPEKNRPLKDRINLVLS 
RELKEPPQGAHFLARSLDDALKLTERPELANKVDMIWIVGGSSV 
YKEAMNHLGH LKL FVTR IMQDFESDT FF SE I DLEKYKLLPE YPG 
I LS DVQ EGKJH I KYKFE VCE KD D 


6620 


3 


1879 


NSRVDDFVARARMAAENEASQESALGAYSPVDYMSITSFPRLPE 
DE P APAAP LRGRKDEDAFLGD PDTD PDS FLKSARLQR LPS SSS E 
MGS QDGS PLRETRKDP FS AAAAECSCRQDGLTVI VTACLTFATG 
VTVALVMQIYFGDPQIFQQGAVVTDAARCTSIjGIEVLSKQGSSV 
DAAVAAALCLGIVAPHSSC5LGGGGVMT,VHnTDDicrPQUT thcduc 

APGALREETLQRSWETKPGLLVGVPGMVKGLHEAHQLYGRLPWS 
OVLAFAAAVAQDGFNVTHDIJUU^AEQLPPNMSERFRETFLPSG 
R P PL PG SLLHR PDliAE VLDVLGTS GPAAFYAGGNLTLEMVAEAQ 
HAGG VITEEDFSNYSAXtVEKPVCGVYRGHLVLS PPPPHTGPAL I 
SALNILEGFNLTSLVSREQAIiHWVAETLKlALALASRLGDPVYI) 
S T I TESMDDMLS KVEAAYLRGH INDSQAA P APLLPVYELDGAPT 
AAQVL IMGPDDF IVAMVSSLNQ PFGSGLITPSG I LLNSQMLDFS 
WPNRTANHSAPSLENSVQPGKRPLSFLLPTWRPAEGLCGTYLA 
LGANGAARGLSGLTQVRFTPWLAFFSREPSCGLDCRCLSYLWLV 
SIPHAANMG 


6621 


1 


662 


VQG I TS YQQRLQALRKEKSRDAARS rrgkenfef yelakllpl p 
AAI TSQLD KAS 1 IRLT IS YLKMRDFANQGDPP WNLRMEGPPPNT 

svkvigaqrrrspsalaievfeahlgshilqsldgyvfalnqeg 
kflyisetvsiylglsqveltgssvfdyvhpgdhvemaeqlgmk 
lppgrgllsqgtaedgassassssqsetpepwcfppasdqfll 




2 


319


ui^sgaqeeteaggperarambanmpkrkepgrslrikvismgn 

AEVGKSCI I KRYCEKRFVSKYLATIGIDYGVTKVHVRDRE IKVN 
IFDMAGHPFFYEVRKPF 


6623 


1886 


189 


KALFEKVKKf'RLHVEEGDILYAMYVRQTVLKVIKFLI I IAYNSA*"™ 
LVSKVQFTVDOmilQDMTGYKNFSCiraamHLFSKLSFCYLCF 
VS I YGLTCLYTLY WLF YRS LRU YS FE YVRQETGFDDI PD VKNDF 
AFMLHM I D QYDP LYS KRFAVFLS E VS ENKLKQLNLNNE WT PDKL 
RQKLQTNAHNRLELPLIMLSGLPDTVFEITELQSLKLEI IKNVM 
I PAT I AQLDNLQELS LHQCS VK I HS AALS FLKENLKVLS VKFDD 
MRELPPWMYGLRNLBELYLVGSLSHDISRNVTLESLRDLKSLKI 
LS IKSNVSKr PC^VVDVSSHLQKMCIHNDGTKLVMLNNLKKMTN 
LT ELELVHCDLERI PHAV FS LLS LQELDLKENNLKS 1 E E I VS FQ 








HLRKLTVLKLWHNSITYIPEHlKKIiTSLERLSFSHNKIEVLPSH" 
LFLCNKIRYLDLSYNDXRFIPPEIGVLQSLQYFSITCNKVESLP 
DELYFCKKLKTLKIGKNSLSVLSPKIGNLLFLSYLDGKGNHFEI 
LPPELGDCRALKRAGLWEDALFETLPSDVREQMKTE 


6624 


218 


1786 


GSRRGGGSRI PAVS THVAPGRS VLRPFASGALRLRSLVKALGGC 
RGRPSGLAHLSQETSHWRAKRSGRACLGDPPGEILRSFIMKCTA 
REWLRVTTVLFMARAI PAMWPNATLLEKLLEKYMDEDGE WW I A 
KQRGKRAITDNDMQSILDLHNKLRSQVYPTASNMEYMTWDVELE 
RS AE SWAES CLWEHG PAS LL PS I GQNLGAHWGR YR Pp TFH VQS W 
YDEVKDFSYPYEHECNPYCPFRCSGPVCTHYTQWWATSNRIGC 
AINLCHNMN I WGQ I WP KAVYLVCNY S P KGNWWGHAPYKHGRPCS 
ACPPSFGGGCRENLCYKEGSDRYYPPRBEETNEIERQQSQVHDT 
HVRTRSDDSSRNEVIS AQQMSQ I VS CE VRLRDQCKGTTCNRYEC 
PAGCLDSKAKVIGSVH YEMQS3 1 CRAAIHYGI IDNDGG WVDITR 
QGR KHY F I KSNRNG I QTIGK YQS ANS FT VS KVTVQAVTCETTVE 
QLCPFHKPASHCPRVYCPRKLYASKSTLCSCNWNSSLF 


6625


1124 


543 


PG P RGGGGS LLSTKALGRS RGLGMH PG PS SGGTEGG VPTXLR P P 
GPLVPSTSDDNLLKNIELFDKLALRFHGRLLFLKDVLGDEICCW 
SFYGQGRKI AEVCCTS I VYATEXKQTKVBFPEARIFEETLNILI 

YETPRGPDPAbLEATGGAAGAGGAGRGEDEENREHRVRRIHVRR 
HITHDERPHGQQIVFKD 


6626 


3 


1498


SAVEFVYTDRFHLILGISVEFLCSLRSDATMESITACLHALQAL 
LDVPWPRS KI GS DQDSG I ELLNVLH R V I LTRE S PS IQLASLE W 
RQ 1 1 CAAQEHVKEKRRSAEVDDGAAEKETLPEFGEGKDTGGLVP 
GKS LVFATLELCVCI LVRQLPELNPKLTGS PGVKATKPQ I LLED 
GSRI/VS AAL VILS EL PAVCSPEGS I S ILPTIL YL TIG VLRE TAV 
KLPGGQLSSTVAASLQALKGILSSPMARABKSRTAWTDLLRSAI, 
TTI LDCWDP VDETHQBLDE VS LLTAI TVF I LS TS PEVTT I PCLQ 
KR C I DKFKATLE I KDP WQI KTYQLLHS I FQ YPNPAVS Y P Y I YS 
LASCIMEKLQEIDKRKPENTAELEI FQEGIKVLETLVTVAEEHH 
RAQLVACLLP IMS FLLDENSLGS ATS IMRNLHDFALQNLMQIG 

PQYSSVFKSLVASSPALKARLEAAIKGNQESVKVKIPTSKYTKS 
PGKNS S I Q LKTS FL 


6627 


1 


697 


GIPHLSSRDMTGTPGAVATRDGEAPERSPPCSPSYDLTGKVMLL
GDTG VGKTCFL I Q FKDGAFLSGTF I ATVGI DFRNKWT VDGVRV 
KLQ I WDTAGQE RFRS VTHAY YRDAQAIiLLL YD I TNKS S FDN I RA 
WLTEIHEYAQRDWIMLLGNKADMSSERVIRSEDGETLAREYGV 
PFLETSAKTGMNVELAFLAIAKELKYRAGHQADEPSFQIRDYVE 
SQKKRSSCCSFM 


6628 


1 


1861 


QCAEFGGGSGGGGGSGGGGSGGGRGAGGEENKENERPSAGSKAN 
KEFGDSLSLEILQIIKBSQQQHGLRHGDFQRYRGYCSRRQRRLR 
KTLNFKMGNRH KFTGKKVTE ELLTDNRYLLLVLMDAERAWS YAM 
QLKQEANTEPRKRFHLLSRLRKAVKHAEELERLCESNRVDAKTK 
LE AQAYTAYLSGMLRFEHQE WKAA I EAFNKCKT I YE KLASAFTE 
EQAVLYNQRVEEISPNIRYCAYNIGDQSAINELMQMRLRSGGTE 
GLtiAEKLEALITQTRAKQAATMSEVEWRGRTVP VKIDKVR I FLL 
vjLiaujn iLJ\j\x VQAE b EETKERLFESMLS ECRDAI Q WREELK PDQ 
KQRDYILEGEPGKVSNLQYLHSYLTYIKLSTAIKRNENMAKGLQ 
RALLQQQPEDDS KRS PR PQDL I RL YD 1 1 LQNLVE LLQLPGL BED 
KAFQKBIGLKTLVFKAYRCFFIAQSYVLVKKWSEALVLYDRVLK 
YANEVNSDAGAFKNS LKDL PDVQEL ITQVRSEKCS LQAAAI LDA 
NDAHQTETSSSQVKDNKPLVERFETFCLDPSLVTKQANLVHFPP 
GFQPIPCKPLFFDLALNHVAFPPLEDlCLEQiCrKSGLXXSYIKGIF 
GFRS 


6629 


5653


4549 


GATPLGS VGGRTGKMDAATLT YDTLR FAEFEDFPETSE PVW I LG " 
RKYSIFTEKDEILSDVASRLWFTYRKNFPAIGGTGPTSDTGWGC 








n ukiaium J. rAyALiv UKliLGRDWRWTQRKRQPDSYFSVIjNAFIDR 
KDSYYS IHQ I AQMGVGEG KS IGQWYGPNTVAQVLKKLAVFDTWS 
SLAVHIAMDNTVVMEBIRRLCRTSV?CAGATAFPADSDRHCNGF 
PAGAEVTNRPSPWRPLVLLI PLRLGLTDINEAYVETLKHCFMMP 

nCT n\FT VI5XTG7\LJVt?T/^VlT/"»CT5T T >rr m~iTTfnfTij~L n «. ■ ■ i i ■ l.i.hu 

U&iA»viLivj&£'ftbiUixr ICji VUEELXYbDPHTTQPAVEPTDGCFI 
PDESFHCXJHPPCRMSIAELDPSIAVVRGGHLSTQAFGAECCLGM 
TRKTFGFLRFFFSMLG 


6630 


2 


423 


LVQCGGIRRRSAWGAMPGRHVSRVRALYKRVLQLHRVLPPDLKS 
IjGDQYVKDE FRRHKTVGSDEAQR FLQ K WEV Y ATALLQQ ANENRQ 
NS TGKACFGTFLPEEKLNDFRDEQIGQLQELMQEATKPNRQFS I 


6631 


2 


423 


LV0CX3GIRRRSAWGAMPGRHVSRVRALYKRVLQLHRVLPPDLKS"" 
LGDQWKDE FRRHKTVGSDEAQR FLQEWEVYATALLQQANENRQ 
NSTGKACFGTFLPEEKLNDFRDEQIGQLQELMQEATKPNRQFSI 


6632 


1273 


588 


WNSRGRTQRGAAPLAPAAAMKAWQRVTRASVtVGGEQISAIGR 
GICVLLGISIiEDTQKELEHMVRKILNLRVFEDESGKHWSKSVMD 
KQYEILCVSQFTIiQCVLKGNKPDFHLAMPTEQAEGFYNSFLEQL 
RKTYRPELIKDGKFGAYMQVHIQNDGPVTIELESPAPGTATSDP 
KQLSKLEKQQQRKEKTRAKGPSESSKERNTPRKEDRSASSGAEG 
DVSSEREP


6633


1145 


617 


ATGRHEG VPTLEG I IQQLVNG I IT PAT I PSLG PWG VLHSN P MD Y 
AWGANGLDAIITQLLNQFENTGPPPADKEKIQALPTVPVTEEHV 
GSGLECP VCKDDYALGERVRQLPCNHLFHDG C 1 VPWLEQHDSCP 
VCRKSLTGQNTATNPPGLTGVSFSSSSSSSSSSSPSNENATSNS 


6634 


1 


1134


CGGIPRKGSGPRRRLPMARLRDCLPRLMLTLRSLLFWSLVYCYC 
GLCASIHLLKLLWSLGKGPAQTFRRPAREHPPACliSDPSLGTHC 
YVRIKDSGLRFHYVAAGERGKPLMLLIiHGFPEFWYSWRYQLREF 
KSEYRWALDLRGYGETDAPIHRQNYKLDCLITDIKDILDSLGY 
SKCVL IGHDWGGM IAWLIA I CYPEM VMKLI VINFPHPNVF7E YI 
r,RHPAQLLKSSYYYFFQIPWFPEFMFSINDFKVLKHLFTSHSTG 
IGRXGCQLTTBDLEA Y I YVFS Q PGALSG P I NH YRN I FSCLPLKH 
HMVTTPTLLLWGENDAFMEVEMAEVTRFYVKNYFRLTILSEASH 
WLQQDQPD I VNKL I WTFLKEETRKKD 


6635


1420 


470 


EMRAGQQLASMLRWTRAWRLPREGXjGPHGPSFARVPVAPSSSSG 
GRGGAEPRPLPLSYRIjIjDGEAALPAVVFLHGLFGSKTNFNSIAK 
I LAQQTGRRVLTVDARNHGDS PHSPDMS YEIMSQDLQDLL PQLG 
LVPCVWGHSMGGKTAMLLALQRPELVERLIAVDISPVESTGVS 
HFATYVAAMRAINIADELPRSRARKLADEOLSSVIODMAVRQHli 
LTNLVEVDGRFVWRVN1jDALTQHIJ5KILAFPQRQESYIX3PTLFI, 
LGGNSQFVHPSHHPE IMRLFPRAQMQTVPNAGHWI HADRPQDF I 
AAIRGFLV 


6636 


1514 


1801 


SFCMFSHKQDSHFQAVPVQEKKKRLRRAPWRAFAQPQRLKHPAE

QPIVRQCLQRPPLCGVLGPVQQQLPPSLGPVLSPHSDPGWCRVD 

DGGDGVF 


6637 


2


1501 


CSS5PCFHDGTCVLDKAGSYKCACIAGYTGQRCBNLLBAGKSKI 
KASEDSLSVLEERNCSDPGGPVNGYQKITGGPGLINGRHAKIGT 
WS FFCNNS YVLSGNEKRTCQQNGEWSGKQPICT KACREPKISD 
LVRRRVLPMQVQSRETPLHQL YSAAFS KQKLQSAPTKKPALPFG 

dlpmgyqhlhtqlqyecispfyrrlgssrrtclrtgkwsgraps 
cipicgkienitapktqglrwpwqaaiyrrtsgvhdgslhkgaw 
flvcsgalvnert vwaahcvtdlgkvtmi ktadlk wlgkfyr 
dddrdektiqslqisaiilhpnydpilldadiailklldkaris 
trvqp iclaasrdlsts fqeshi t vagwnvladvrspg fkndtl 
rsgws wdsllceeqhedhg i pvs vtdnmfcasweptapsdi c 
taetgg iaavs fpgras pe pr whlmglvs ws ydktcs hrls taf 








TKVLPFKDWIERNMK 


6638 


1391 


224 


GGIPQAGGKMAAPWWRAALCECRRWRGFSTSAVIX3RRTPPLGPM 
PNS0IDLSNLERLEKYRSFDRYRRRAEQEAQAPHWWRTYREYFG 
EKTDPKEKIDIGLPPPKVSRTQQLLERKQAIQELRANVEEERAA 
RLRTASVPHiDAVRAEWERTCGPYHKQRLAEYYGLYRDLFKGATF 
VPR VPLHVAYAVGEDDLM P VYOGN E VT PTE aaqap b vtyeaeeg 
SLWTLLLTSLDGKLLEPDAEYLHWLLTNIPGNRVAEGQVTCPYL 
PPFPARGSGIHRLAFLLFKQDQPIDFSEDARPSPCYQLAQRTFR 
TFDFYKKHQETMTPAGLS FFQCRWDDSVTYI FHQLLDMRE PVFE 
FVRPPPYHPKQKRFPHRQPLRYLDRYRDSHEPTYGIY 


6639


2046 


1268 


IGCFIMDC5GDDGNLIIKKRFVSEAELDERRKRRQEEWEKVRKPE 
DPEECPEEVYDPRSLYERLQEQKDRKQQEYEEQFKFKNMVRGLD 

NKKEVEKKLTVKPIETKNKFSQAKLLAGAVKHKSSESGNSVKRL 
KPDPEPDDKNQEPSSCKSLGNTSLSGPSIHCFSAAVCIGILPGL 
GAYSGSSDSESSSDSEGTINATGKIVSSIFRTNTFLEAP 


6640 


117 


1043 


VLEPPDVSMAESEDRSLRIVLVGKTGSGKSATANTILGEEIFDS 

RIAAQAVTKNCQKASREWQGRDI.I.WDTPGLFDTKESLDTTCKE 

ISRCIISSCPGPHAIVLVLLLGRYTEEEQKTVALIKAVFGKSAM 

vtiMVT T i PTR VP PT .Pftfi Q punpT AnATWOTiKfl T VKP C CAP^ 
Ann v liir xnj\Ctaua\3^iOC ciijc d.t\ut\u vvjxji\o -l v rv^cunivn-fu; o 

NSKKTSKAEKESQVQELVELIEKMVQCNEGAYFSDDIYKDTEER 

LKQREEVLRKI YTDQLNE E I KLVE EDKHKS EE KKE KEI KLL KLK 

YDEKIKNIREEAERNIFKDVFNRIWKMLSEIWHRFLSKCKFYSS 




6641


1


894


^VGRRSEVR6cAPRPRLRRSARRMDPVPGTDSAPLA^t»A^SS 
ASAPPPRGFSAISCTVEGAPASFGKSFAQKSGYFLCLSSLGSLE 
NPQENWADIQIWDKSPLPLGFSPVCDPMDSKASVSKKKRMCV 
KLLPLGATDTAVFDVRLSGKTKTVPGYLRIGDMGGFAIWCKKAK 
APR P VPKPRGLS RDMQGLS LDAASQP S KGGLLERTAS RLG SRAS 
TLRRNDS I YEAS SL YG ISAMDGVPFTLHPRFEGKS CS PLAFSAF 
GDI ,T T ICS TAD T ERE YNYG FWE KTAJLARLP P S VS 


6642


22 


1296 


PLEERMMTKMDPNDQAQRDIIFELRRIAFDAESDPSNAPGSGTE 

KRKAMYTKDYKMI^FTNHINPAMDFTQTPPGMIJ^ 

HQDTYIRIVLENSSREDKHECPFGRSAIELTKMLCEILQVGELP 

NEGRNDYHPMFFTHDRAFEELFGICIQLLNKTWKEMRATAEDFN 

KVMQWREQITRALPSKPNSLDQFKSKLRSLSYSEILRIiRQSER 

MSQDDFQSPPI VELREKIQPEILELI KQQRLNRLCEGSS FRXIG 

NRRRQERFWYCRLALNHKVLHYGDLDDNPQGEVTFBSLQEKI PV 

ADIKAIVTGKDCPHMKEKSALKQNKEVLELAFSILYDPDETLNF 

I APNKYE YC I W I DGLS ALLG KDMSS E LT KSDLDTLLS MEM KLRL 

LDLENIQIPEAPPPIPKEPSSYDFVYHYG 


6643 


304$ 


2265


SLHAPAEGRTRGRLAEKPKMLTRKIKLWD1NAHITCRLCSGYLI 
DATTVTECLHTFCRSCLVKYLEENNTCPTCRIVIHQSHPLQYIG 
HDRTMQDIVYKIjVPGIiQBAEMRKQREFYHKLGMEVPGDIKGETC 
SAKQHIiDSHRNGETKADDSSNKEAAEEKPEEDNDYHRSDEQVSI 
CL ECNS S KLRG LKRXW IRCS AQATVLHLKKF1 AKKLNL S S FN EL 
DILCNEEILGKDHTLKFVVVTRWRFKKAPLLLHYRPKMDLL 


6644 


1489 


290 


PR PLATE PRGS S P VQLVSSTMSVRTLPLLFLNLGGEMLY I LDQR 
LRAQN I PGDKARKVLND 1 1 STMFNRKFMEELFKPQELYS KXALR 
TVYERLAHAS IMKLNQASMDKLYDLMTMAFKYQVLLCPRPKDVL 
LVTFNHLDT I KGF I RDS PT I LQQVDETLRQLTE I YGGLS AGE FQ 
LIRQTLIilFFQDLHIRVSMFLKDKVQWNNGRFVLPVSGPVPWGT 
EVPGLIRMFNNKGEEVKRIEFKHGGNYVPAPKEGSFEFYGDRVL 
KLGTNMYSVNQPVETHVSGSSKNLASWTQESIAPNPLAKEELNF 
LARLMGGME IKKPSGPEPGFRLNLFTTDEEEEQAALTR PEELS Y 
E VIN I QATQDQQRS EELAR I MGEFE I TEQPRLSTS KGDDLLAMM 
DEL


6645


6530 


4646 


FVEGLAGYVYKAASEGKVLTLAALLLNRSESDIRYLLGYVSQQG 
GQRSTPL1 1 AARNGHAKVVRLLLEH YRVQTQQTGTVR FDG YV I D 
GATALWCAAGAGHFE WKLL VS HG ANVNHTTVTNSTPLRAACFD 
GRLDIVKYLVENNANIS I AN KYDNTCLM I AAYKGHTD WR YLLE 
QRADPNAKAHCGATALHFAAEAGHIDI VKELI KWRAAI WNGHG 
MTPLKVAAESCKADWELLLSHADCDRRSRIEALELLGASFANB 
RENYDIIKTYHYLYLAMLERFQDGDNILEKEVLPPIHAYGNRTE 
CRNPQELES I RQDRDALHMEGLI VRERI LGADNIDVSHP IIYRG 
AVYADNME FEQCI KLWLHALHLRQKGNRNTHKDLIiRFAQVFSQM 
IHLNETVKAPDIECVIiRCSVLEIEQSMNRVKNISDADVHNAMDN 
YECNLYTFLYLVCISTKTQCSEEDQCKINKQIYNLIHLDPRTRE 
GFTLLHLAVNSNTPVDDFHTNDVCS FPNALVTXLLLDCGAEVNA 
VDNEGNSALHI IVQYNRPISDFLTLHS 1 I ISLVEAGAHTDMTNK 
QNKTPLDKSTTGVSEILLKTQMKMSLKCLAARAVRANDINYQDQ 
IPRTLEEFVGFH 


6646 

6647


176 


890 


PSSRMNHLPEDMENALTGSQSSHASLRNIHSINPTQLMARIESY 
EGREKKGISDVRRTFCLFVTFDLLFVTLLWIIELNVNGGIENTL 
EKEVMQYDYYSSYFDIFLLAVFRFKVLIIAYAVCRLRHWWAIAL 
TTAVTS AF LLAKVILS KLFSQG AFG YVLP I 1 S F I LAW I ETWFLD 

FKVLPQEAEEENRLLIVQDASERAALIPGGLSDGQFYSPPESEA 
GS EEAEE KQDS EKPLLE L 




176 


890 


PS S RMNHL P E DM ENALTGS QS SHAS LRN I HS I NP TQLMARI BS Y 
EGR E KKGI SD VRRTFCL FVTFDLL FVTLLWI I ELNVNGGI ENTL 
EKEVMQYDYYSSYFDIFLLAVFRFKVLILAYAVCRLRHWWAIAL 
TTAVTSAFLLAKVILSKLFSQGAFG YVLP I IS F ILAWIET WFLD 
FKVIiPQEAEEENRLLI VQDASERAALI PGGLS DGQFYSPPESEA 
GS EEAEE KQDS EKPLLE L 


6648


413


897 


RNCWNCFTKYFNSPPEDIDHKDSYLITRSIMAEPDYIEDDNPEL
IRPQKLINPVKTSRNHQDLHRELIiMNQKRGLAPQNKPBLQKVME 
KRKRDQVIKQKEEEAQKKKSDLEIELLKRQQKLEQLELEKQKLQ 
EEQENAPEFVKVKGNLRRTGQEVAQAQES 




6649


1357


832 


W I PRAAG I RHE VKWD VKE I MSQHN I Y VDALL KE FEQFNRRLNEV " 
S KRVRI PLP VSNI LWEHC X RLANRT I VEG YANTVKKCSNEGRALM 
QLDFQQFLMKLEKLTDIRPIPDKEFVETYIKAYYLTENDMERWI 
KEHREYSTKQLT^VNVCLGSHINKKARQKLLAAIDDIDRPKR 


6650


32 


765 


LVPLVFSLLVQSCKQVYRSIAMKFVPCLLLVTLSCLGTLGQAPR "" 

QKQGSTGEEFHFQTGGRDSCTMRPSSLGQGAGEVWLRVDCRNTD 

0/TYWCEYRGQPSMCX3AFAJu^PKSYWNQALQELRRLHHACQGAPV 

LRPSVCREAGPQAHMQQVTSSLKGSPEPNQQPEAGTPSLRPKAT 

VKLTEATQLGKDSMEELGKAKPTTRPTAKPTQPGPRPGGNEEAK 

KKAWEHCW KP FQALCAFL I S F FRG 


6651


3425 


1353 


AKELL K VGDFSLCAGP YOJNTADTMENLSKBPIiAS FVSES FDI SA " 

CGIATEHVKIDNSGEGLTAEAGSETLSRDGEVGVNSDMHYELSG 

DSDLDLLGDCRNPRLDLEDSYTLRGSYTRKKDVPTDGYESSLNF 

HNNNQEDWGCSSWVPGMETSLPPGHWTAAVKKEEKCVPPYVQIR 

DLHG I LRTYAN FS I TKELKDTMRT SHGLR RH PS FS ANCGLP S S W 

* « * « w » rujuu i yix i uuua i UK r AH iUj A.y 1 X KNCil/SQHSASSANV 

FPKESPTQISIGAFPSTKISEAPFLHPAPRSRSPLLVTWESDP 

RPQGQPRRGYTASSLDSSSSWRERCSHNRDLRNSQRNHTVSFHL 

NKLKYNSTVKESRNDISLILNEYAEFNKVMKNSNQFIFQDKELN 

DVSGEATAQEMYLPFPGRSAS YEDII IDVCTNLHVXLRS WKEA 

CKSTFLFYLVETEDKS FFVRTKNLLRKGGHTEI E PQHFCQAFHR 

ENDTLIIIIRNEDISSHLHQIPSLLKLKHFPSVIFAGVDSPGDV 

LDHTYQELFRAGGFVISDDKILEAVTLVQLKEI I KILEKLNGNG 

RWKWLLHYRENKKLKEDERVDSTAHKKNIMLKSFQSANIIELLH 

YHQCDSRSSTKAEILKCLLNLQIQHIDARFAVLLTDKPTIPREV








FENNGI LVTD VNNF I EN I EKIAAP FRSS YW 


6652 


2 


1343 


IPGSTISCSCHSRRLRGGSPAPRLSLGAA^PRPRPPSLPLPLPL 
P FP LFLP TRPAERAW I RS RRAS E WVGKME VPRLDHALNS PTS PC 
EEVIKNLSLEAIQLCDRDGNKSQDSGIAEMEELPVPHNIKISNI 
TCDS FKI S WEMDS KS KDRI THYFIDLNKKENKNSNKFKHKDVPT 
KLVAKAVPLPMTVRGHWFLS PRTEYTVAVQTASKQVDGDYWSE 
WSEIIEFCTADYSKVHLTQLLEKAEVIAGRMLKFSVFYRNQHKE 
YFDYVREHHGNAMQPSVKDNSGSHGSPISGKLEG1PFSCSTEFN 
TGKPPQDSPYGRYRFEIAAEKLFNPNTNLYFGDFYCMYTAYHYV 
ILVIAPVGSPGDEFCKQRLPQLNSKDNKFLTCTEEDGVLVYHHA 
QDVILEVIYTDPVDLSLGTVAEITGHQLMSLSTANAKKDPSCKT 
CNISVGR 


6653 


170 


1910 


FFLEPRLRPFPASRARFVPARTRPSPLHPCCFCFEGGGSMLSPQ 
RVAAAASRGADDAMES SKPGPVQWLVQKDQHS FELDEKALASI 
liLQDHIRDLDWVVSVAGAFRXGKSFILDFMLRYLYSQKESGHS 
NWLG DP E E P LTGFS WRGGSD PETTG I Q I WSE VFTVE K PGGKKVA 
WLMDTQGAFDSQSTVKDCATIFALSTMTSSVQIYNLSQNIQED 
DLQQLQLFTEYGRLAMDEIFQKPFQTLMPLVRDWSFPYEYSYGL 
QGGMAFLDKRLQVKEHQHSEIQNVRNHIHSCFSDVTCFLLPHPG 
LQVATSPDFDGKLKD1AGEFKEQLQALIPYVLNPSKLMEKEING 
S KVTCRGLLE YFKAYI KI YQGEDLPHPKSMLQATAEAYNLAAAA 
S AKD I YYNNMEE VCGGE KP YLS P D I LEEKH C E FKQLALDHFKKT 
KKNGGKDFSFRYQQELEEEIKELYENFCKHNGSKNVFSTFRTPA 
VLFTG I VAL YI ASGIiTGFIGLE WAQLPKCMVGLLL IALLTWGY 

IRYSGOYRELGGAIDFGAAYVLEQASSHIGNSTQATVRDAWGR 
PSMDKKAQ 


6654


1 


705 


RTS LS PSQ CS S FNLAMAS AGMQI LG VVLTLLG WVNGLVS CALPM 
WKVTAFIGNS I WAQWWEGLWMS C WQSTGQMQ CKVYDS LLAL 
PQDLQAARAL CVIALLVALFGLLVY LAG AKCTTCVEEKDS KARL 
VLTSG I VF VI SGVLTLI PVCWTAHAVIRDFYNPLVAEAQKRELG 
AS L YLGWAASGLLLLGGGLLCCTC P S GGSQGPSHYMAR YS TS AP 
AISRGPS EYPTXNYV 


6655 


341 


1* 


KDA YM FKKGLLALALVFSLP VFAABHW I DVR VPEQ YQQEHVQGA 
INI PLKEVKER I ATAVPDKNDTVKVYCNAGRQSGQAKE ILS EMG 
YTHVENAGGLKD I AMP KVKG 


6656 


2 


1212 


TELPPRPANLAIQPPLSPLRALAPLPEKPGAVPPPQKRMAKVAk 
DLNPGVKKMS LGQLQSARG VACLGC KGTCSGFEPHS WRKI CKS C 
KCSQEDHCLTSDLEDDRKIGRLLMDSKYSTLTARVKGGDGIRIY 
KRNRM I MTNP IATGKDPTFDTITYEWAPPGVTQKLGLQ YMELIP 
KBKQPVTGTEGAFYRRRQLMHQLPIYDQDPSRCRGLLENELKLM 
EEFVKQYKSEALGVGEVALPGQGGLPKEEGKQQEKPEGAETTAA 
TTNGSLSDPSKEVEYVCELCKGAAPPDSPWYSDRAGYNKQWHP 
TCF VCAKCS E PL VDJLi I YFWKDGAP WCGRHYCE S LR PRCSGC DE I 

IFAEDYQRVEDLAWHRKHFVCEGCEQLLSGRAYIVTKGQLLCPT 
CSKSKRS 


6657
6658


83b 

3* ' " 


2120 

855


lltcqeragd^llsastmkeVVywspkkvadwllenampeycep 
leh ftgqdl inltqe dfkk p plcrvs sdngqrlldm i etlkmeh 
hleahknghanghlnigvdi ptpdgsfs ikikpngmpngyrkem 

I KI PM PELE R SQ YPME WGKTFLAFL YALS CFVLTTVM I S WHER 
VPPKE VO P PLPDTFFDH FNR VQWAFS I CEINGM ILVGLWLI QWL 
LLKYKS I ISRRFFCI VGTLYLYRCITMYVTTLPVPGMHFNCS PK 
LFGDWEAQLRRIMKLIAGGGLSITGSHNMCGDYLYSGHTVMLTL
TYLFIKEYS PRRLW WYHWI CWLLS WG I FCI LLAHDH YTVD WV 
AYYITTRLFWWYHTMANQQVLKEASQMNLLARVWWYRPFQYFEK 
NVQG I V PRS YHWPFP W P WHLS RQVK YS RLVNDT 
HCCALGAPGSPYRGLYFSSAAPCTAPRKAKHQSTLEGLTKRMLM








FDPVPVKQEAMDPVSVSxPSNYMESMKPNKYGVIYSTPLPEKFF 

QTPBGLSHGIQMEPVDLTVNKRSSPPSAGNSPSSLKFPSSHRRA 

SPGLSMPSSSPPIKKYSPPSPGVQPFGVPLSMPPVMAAALSRHG 

IRSPGILPVIQPVWQPVPFMYTSHLQQPLMVSLSEEMENSSSS 

MQVPVIESYEKPISQKKIKIEPGIEPQRTDYYPEEMSPPLMNSV 
SPPQALLQE 


6659 


18 


523 


EPQRGDCETWFQNCSLPKFVCFFCWGFWLWRAHSMSNLHSLPGL 
RGLTS ISRNQLQCTNAMRVINNYQRRWKNQNTFLLATFANWNV 
OGNPT I TC PHNRTLNNCHHSG VQ VPLM YCN LTTPS PQN I SN CR Y 
AQTPANMFYIVACDNRDQRRDPPQYPWPVHLHTII 


6660 


CIA 


1707 


CAASLDCRHHIXEPDMXLVWPSAKIjIjQAAAGASARACDSVTSNV 
LPLLLEQFHKHSQSSQRRTILEMLLGFLKLQQKWSYEDKDQRPL 
NGFKDQLCS LVFMALTDPS TQLQLVG I RTLT VLGAQP DLLS YED 
LELAVGHLYRLSFLKEDSQSCRVAALEASGTLAALYPVAFSSHL 
VPKLAEELRVGBSNLTNGDEPTQCSRHLCCLQALSAVSTHPSIV 
KETLPLLLQHLWQVNRGNMVAQSSDVIAVCQSLRQMAEKCQQDP 
ES CW YFHQTAI PCLLALAVQASMPE KE PS VLR KVLLEDE VLAAM 
VSVIGTATTHLSPELAAQSVTHIVPLFLDGNVSFLPENSFPSRF 
Q P FQDGS SGQRRLI ALLMAFVCS LPRNVS EH I WE VLLFNLDK VT 
PG 


6661 


179 


430 


GVHAASGTLSATWIAEAKMFDSLAXAGKYLGQAAKLM'IGMPDYD 
NYVEHMRVNHPDQTPMTYEEFFRERQDARYGGKGGARCC 


6662 


185 


423 


R S L P KPAPAQPAS 1 H CAR FSGVT P PTAKTAMSDGlJ^AFNALMYC" 
GPKADDGNI FSACAPASSAVKAS VS VAQPGQAVI P 


6663 
6664


3 


1005 


R P VLS S R VDD FVP PLP ETS GRR KKLER MYS VDR VSDD I P I RTW F 
PKENLFSFQTAS TTMQAISNFRKHLRMVGSRR VKAQTFAERRER 
S FS RS WSD PTPMKADTS HD SRDS S DLQSSHCTLDEAFEDLDWDT 
EKGLEAVACDTEGFVPPKVMLISSKVPKAEYIPTIIRRDDPSII 
P IL YDHEHATFED I LEE I ERKLNVYHKGAKI WKMLIFCQGGPGH 
LYLLKNKVATFAKVEKEEDMIHFWKRLSRLMSKVNPEPNVIHIM 
G CY I LGN PNGEKL FQNLRTLMTP YR VTFBS PLE LSAQGKQMI E T 
Y FDFRL YRLWKSRQHS KLLDFDDVL 






968 


PRLLRLPR S WVMDS PWDE LALAFSRTSMFP F FD I AHYL VS VMA 
VKRQ PGAAALAWKNP IS S WFTAMLHC FGGG I LS CLLLAE P PLKF 
LANHTNILLAS S I WY IT FFCPHDLVSQGYSYLPVQLLASGMKEV 
TRTWKIVGGVTKANSYYKNGWIVMIAIGWARGAGGTIITNFERL 
VKGDWKPEGDE WL KMS Y PAKVTLLG S VI FTFQRTQHLAI S KHNL 
MFLYTIFIVATKITMMTTQTSTNTFAPFEDTLSWMLFGWQQPFS 
SCEKKSEAKS PSNGVGSLAS KPVDVAS DNVKKKHTKKNE 


6665 


171 


1278 


DERRLACRQVVTQQRSELYPGFQKRQRFLPKAGEEAAAQGGRHL 
PGRWLG PGCTQNPCS VHTATGPEPRKLPLL PPDS PNSG YP KE PA 
ALCPGI PSPCRMTHQDLS ITAKLXNGGVAGLVGVTCVFPIDLAK 
TRLQNQHGKAMYKGM I DCLMKTARAEGFFGM YRGAAVNLTL VTP 
EKAI KLAANDFFRRLLMEDGMQRNLKMEMIiAGCGAGMCQVVVTC 
PMEMLKIQLQDAGRLAVHHQGSASAPSTSRSYTTGSASTHRRPS 
ATLI AWELLRTQGLAGL YRGLGATLLRDIPFS 1 I YFPLFANLNN 
LGFNELAGKAS FAHS FVSGCVAGS I AAVAVTPLDVLKTR T ott ,v 
KGLGEDMYSGITDCAR 


6666 


498 


2860 


MTTFLP VPQMMAGFS FGTFGNPPMES PSAWQT IHQP F I VS CLTL 
WSPGCWPQPIQKEGVGLWDIRKPQSSLLRYGGNLSLQSAMSVRF 
NSNGTQLLALRRRLPPVLYDIHSRLPVFQFDNQVYFNSCTMKSC 
CFAGDRDQYILSGSDDFNLYMWRIPADPEAGGIGRWNGAFMVL 
KGHRS I VNQVRFNPHTYMICSSGVEKI IKIWSP YKQPGCTGDLD 
GRI E DDS RCL YTH EE Y ISLVLNSGSGLSHDYANQS VQEDPRMMA 
FFDS L VRRE I EGWS SDS DSDLS ES T I LQLHAG VS E RSGYTDS ES 
SASLPRSPPPTVDESADNAFHLGPLRVTTrNTVASTPPTPTCED








AASRQQRLSAIiRRYQDKRLLALSNBSDSEENVCEVBLDTDLFPR 
PRSPSPEDESSSSSSSSSSEDBEELNERRASTWQRNAMRRRQKT 
TREDKPSAP I KPTNTY I GEDNYD YPQIKVDDLSSSPTS S P ERS T 
STLEIQPSRASPTSDIESVBRKIYKAYKWLRYSYISYSNNKDGE 
TSLVTGEADEGRAGTSHKDNPAPSSSKEACLNIAMAQRNQDLPP 
EGCSKDTFKEETPRTPSNGPGHEHSSHAWAEVPEGTSQDTGNSG 
SVEHPFETKKLMGKALSSRAEEPPSPPVPKASGSTLNSGSGNCP 
RTQSDDSEERSLETI CANHNNGRLHPRPPHPHNNGQNLGELEW 
A* 5> b FQHS DTDRDNSSLTGTLLHKDCCGSBMACSTPNAGTREDP 
TDTPATDSSRAVHGHSGLKRQRIELSDTDSENSSSEKKLKT 


6667 


171


1310 


ABEVERLAAMRSDSLVPGTHTPPIRRRSKFANLGRIPKPWKWRK 
KKSEKFKHTS AALERKI S MRQSREELIKRG VLKE I YDKDGELS I 
SNEEDSLENGQSLSSSQLSLPALSEMBPVPMPRDPCSYEVLQPS 
DIHDGPDPGAPVKLPCLPVKLSPPLPPKKVMICMPVGGPDLSLV 
SYTAQKSGQQGVAQHHHTVLPSQ I QHQLQYGSHGQHLPSTTGSL 
PMHPSGCRM I DELNKTLAMTMQRLESS EQRVP CSTSYHS SGLHS 
GDGVTKAGPMGLPEIRQVPTWIECDDNKENVPHESDYEDSSCL 
YTREEEEEEEDEDDDSSIiYTSSLAMKVCRKDSLAIKPSNRPSKR 
ELEEKN I L PRQTDEERL E LRQQIG TKL 


6668 


714 


358 


TLAVATGPALTLRCHVCTSSSNCKHSWCPASSRFCKTTNTVBP 
LRGNLVKKDCAESCTPSYTLQGQVSSGTSSTQCCQEDLCNEKLH 
NAAPTRTALAHSALSLGLALSLLAVILAPSL 


6669


459 


1207 


KDEETRKDYDYMLDHPEBY YSHY YHYYSRRLAPKVDVRWI LVS 
VCA I S VFQFFSW WNS YNKA I S YLAT VPKYRI QATE I AKQQGLL K 
KAKEKGKNKKSKEEIRDEEENIIKNIIKSKIDIKGGYQKPQICO 
LIiLFQ 1 1 LAP FH LCS Y I VW YCRW I YNFNI KGKE YGEEERL YI I R 
KSMKMSK3QFDSLEDHQKETFLKRELWIKENYEVYKQEQEEELK 
KKLANDPRWKRYRRWMKNEG PGRLT F VDD 


6670 


184 


594 


VAR I * GEAAKMSSEP PPPYPGGPTAPLLEEKSGAPPTPGRSS PA 
VMQPPPGMPLPPADIGPPPYEPPGHPMPQPGFIPPHMSADGTYM • 
P PGFY P P PG PHP PMG Y YP PG P YTPGP YPG PGGHTATVLVP S GAA 
TTVTV 


6671 


1 


763 


lpaekprsapnmaggrcgpqltallaawiaaVaatagpeeaalp'"' 

PEQSRVQPMTASNWTLVMEGEWMLKFYAPWCPSCQQTDSEWEAF 
AKNGEILQISVGKVDVIQEPGLSGRFFVTTIjPAFFHAKDGIFRR 

yrgpgifedlqnyilekkwqsvepltgwkspasltmsgmaglfs 
isgkiwhlhnyftvtlgipawcsyvffviatlvfglsmdlvl*v 
i sqcnwdp p yrhvs * /rpstnlgvhtahtsehlrl 


6672


304


1089 


APGSKP VQ FMD FEGKTS FGMS VFNLSNA I MGSGI LgLaYAKAHT? 
GVI FFLALLLCIALLS S YSIHLLLTCAGIAG I RAYEQLGQRAFG 
«*w*w v va I v iLJjtiW vljAMSS YLFIIKSEJjPLVIGTFLYMDPEG 

dwflkgnlliiivsvliilplalmkhlgylgytsglsltcmlff 
lvsviykkfqlglcyratmkqqwesealvgtpqprdstaavkaq 

MFHS *LTG VLTQ WPI MAFAFV CHP GGAG PS I TELCRAFQAQD 


6673 




1963 


lqiqthhthhgarvthlgshqllanagtmlcrqqsssMapafsq" 

S VTCGP S PC VRKQES ATKCLH IGACGSDLWARGWEQG+ G * GLNV 
WI J CPCVAFHRGARPQAEEGGARWNSLVSSPWrPPNP*HSSIGAE 
NAVPRP * QG + KVN PSGQERQS \ WVLPLP VPGEPLKL PGL PG*NK 
SFSRV/SGSKGKWILPRQLM*AS+R\TPRFVPGTQWVPITW/PL 
ITWH*SAPTPPLKACPAPRESDPCSSCLSCPCVTQKPRFSDTGW 
FGAGHCHS SCDFTRKGAAGGPG 1 


6674 


1 


440 


LEFDYMCQYDYVEVRDGDNRDGQltKRVCGNERPAPIQSIGSSL 
HVLFHSDGSKNFDGFHAI YEE I TACSSS PCFHDGTCVLDKAGS Y 
KCACLAG YTG QR CENLLE ERNCS DPG/ WPS QW VP ENNRG PWA YQ 
PTPC* IGTRVAFFLT


6675 


i 277 




GNWPTERMA F LDNPT 1 I LAHIRQS HVTSDDTGM CEMVL I DHDVD 
LBKIHPPSMPGDSGSEIQGSNGETQGYVYAQSVDITSSWDFGIR 
RRSNTAQRLERLRKERQNQIKCKNIQWKERNSKQSAQELKSLFB 
KKSLKEKPP I SGKQS ILSVRLEQCPLQLNNPFNEYSKFDGKGKV 
G TTATKKI D VYLPLH S S QDRLLP MT VVTMASARVQDL IGL I CWQ 
YTSEGREPKLNDNVSAYCLHIAEDDGEVDTDFPPLDSNEPIHKF 
GPSTLALVEKYSSPGLTSKESLFVRINAAKGFSLIQVDNTKVTM 
KEILLKAVKRRKGSQKVSGSRADGVFEBDSQIDIATVQDMLSSH 
HYKSFKVSMIHRLRFTTDVQL/GCALFPGVLRKRAAPVDCLRPS 
ADTWRQEQIGCCGAACAALRS*DSHKC*EGISGDKVEIDPVTNQ 
KASTKFWIKQKPISIDSDI*LCAC\DLAEE 


6676 





1678 


GKWPTERWAFLDNPTIILAHIRQSHVTSDDTGMCEMVL2DHDVD 
LBK1HPPSMPGDSGSE1QGSNGETQGYVYAQSVDITSSWDFGIR 
RRSNTAQRLERLRKERQNQIKCKNIQWKERNSKQSAQELKSLFE 
KKSLKEKPPISGKQSILSVRLEQCPLQLNNPFNEYSKFDGKGHV 
GTTATKKIDVYLPLHSSQDRLLPMTWTMASARVQDLIGLICWQ 
YTSEGREPKLNDNVSAYCLHIAEDDGEVDTDFPPLDSNEPIHKF 
GFSTLALVEKYSSPGLTSKESLFVRINAAHGFSLIQVDNTKVTM 
KEILLKAVKRRKGSQKVSGSRADGVFEEDSQIDIATVQDMLSSH 
HYKSFKVSMIHRLRFTTDVQL/GCALFPGVLRKRAAPVDCLRPS 
ADTWRQEQIGCCGAACAAIiRS*DSHKC*EGISGDKVEIDPVTNQ 
KASTKFW I KQKP I S IDS DLLCAC\ DLAEE 


6677 


£.11 


1678 


GNWPTERMAFLDNPTI I IAH I RQS HVTSDDTGM CEMVL I DHDVD 
LEKIHPPSMPGDSGSEIQGSNGBTQGYVYAQSVDITSSWDFGIR 
RRSNTAQRLERLRKE RQNQI KCKN I QW KERNS KQS AQE LKSL FE 
KKSLKEKPPISGKQSILSVRLEQCPLQLNWPFNEYSKFDGKGHV 
GTTATKKIDVYLPLHSSQDRLLPMTWTMASARVQDLIGLICMQ 
YTSEGREPKLNDNVSAYCLHIAEDDGEVDTDFPPLDSNEPIHKF 
GFSTLALVEKYSSPGLTSKESLFVRINAAHGFSLIQVDNTKVTM 
KE I LLKAVKRRXGS QKVSGS RADGVFEEDSQI D I ATVQDMLS3H 
HYKS FKVSM IHRIjRFTTDVQL/GCAL FPGVLRKRAAPVDCLRPS 
ADTWRQEQ IGCCGAACAALRS *DSHKC* EG I SGD KVE I DPVTNQ 
KASTKFWIKQKPISIDSDLLCAC\DLAEE 


6678




dec ""' 

ODD 


GPSNQSS3SLSLIVTGCSSYWS*INDTCTILRVLSSNFGRQ*LR 
PFPCSQLPMSQGCLWHLDCCCPWVPYIPGOjQWRKGRQRMRN*QS 
LLGS DQE S VGLEDLCVFVNFLLHVLLGLFP * PHEL FLLP WDLG 
FLFPLLLQGGCHCLVLPANLVSQAPQIGKLSCRLQTHDIiEGSRN 
HHPLFLWGRVIDAVKHLETVQSGLASLGFVGQHTSHGPP 


6679 


2 


786 


LEFARGAMPFLGQDWRSPGQNWVKTVDGWKRFLDEKSGSFVSDL 
SSYCDJKE VYNKENLFNSLNYD/ SCSQEEKEGHAE *QNQNS \DFH 
QEKWIYVHKGSTKERHGYCTLGEAFNRLDFSTAILDSRRFNYW 
RLLE L I AKSQLTS LSG I AQKNFMN I LEKVVLKVLEDQQNI TLX R 
ELLQTLYTSLCTLVKRVGKSVLVGNINMWVYRMETILHWQQQLN 
NIQITRVSGQAQPPPGSGSLHRDTGQTRQDFEFTPVTEESGLF 


6680 


1498 


2951 


PLCTLPLMPSALPGWAGERWEKQWPliA/ PGPGTWQTPVGS ISEE 
P\RKNEPDTHCPRGEARPEV* HLPKPHSPf3SEGAETOT<!n * AT.D 

/NQVS PPQPM*GAEENGDQRGGKEEAGEELHRSS SGLTAAPGF? 
EVHRNLQTFPGLPSRGGGP/GGAGTQGSWAPGEQPP/SPLLPAS 
MQRSQAGLPGWEAGLVESPTHHIPALRPSGTNATGEAFPSTTCS 
SGP\ PAP PGPTGLRPGGGSSSGGHG * * PGLPVGKV\GALGAAQD 
PQSQGRG PTQGTVGTEMIiLSGLGS AKAC PAARPAVP * L PSD PAS 
TIPKKGTRGFGEGPGVLQERNRWWGRAQGFTSADAAGTAPPGV 
♦LPAPLSQPPGATEPQVRACGMAPPSPGrSGRLVANGRHPGPQV 
AQGC PPGAGCWGSQPRGSQRCPRTYTHS PLGHGRAPCPRRCWH* 
WQDP PSS PRTGCLPGI PARQAYSAPRTRSRPGI RTGRAAYGFIR 
FQGGGGG 






WO 01/53312 



PCT/US00/34263 





6681 


1169 


511 


inyiyynqqqrafhelkVeklmsapalglpdltklftlhvsere

KMTVGVLTQTVGPWSRPGAYLSKQLDGVSKGWPPCPRALAATAL 

LAQEADELTLRQNLNRKSPHA\WTLINTKGHH*LINARLTRYQ 

TLLCENPHKT1EVSNT/LNPATLLLVTESPVKHNCLEVLDSVYS 

SRPNLRDHP* TS VDWELYVDGSGFANPCKVTLKKETSPAPVTPR 
S 


6682 


109 


1238 


T VLCGAMQVS SLNE VKI YS LSCG KS LPEWLS DRKKRALQKKDVD 
VRRR I EL I QD FEM PT VCTT I KVS KDGQ Y I LATGTYKPR VRCYDT 
YQLS LKFERCLDS E WTFE I LSDD YSKI VFLHNDRYI EFHS QSG 
F Y YKTR I PKFGRD FS YHYPSCDL YFVGASS EVYRLNLEQGR YLN 
PLQTDAABNNVCDINSVHGLFATGTlEGRVECWDPRTRNRVGLIi 
D\AP*TVSO^IQR*TSLPTISALKFN\GALTMAVGTTTGQVLJjY 
DLRSDKPLLVKDHQYGL P I KS VH FQDSLDL II^SADSR I VKMWNK 
NSGKIFTSLEPEHDLNDVCLYPNSGMLLTANETPKMGIYYIPVL 
GPAPRWCS FLDNLTEELEENPESNE 


6683


109 


1238 


TVLCGAMQVsSLNEVKIYSLSCX3KSLPEWI^DRkKRALQKKDVD~ 
VRRR I EL I QD FEMPTVCTTIKVS KDGQY I LATGTYKPR VRCYDT 
YQLSLKFERCLDSEWTFEILSDDYSKIVFLHNDRYIBFHSQSG 
FY YKTRI P K FCRDFS YHYPS CDL YFVGAS S EVYRLNLBQGR YLN 
PLQTDAAENNVCDINSVHGIiFATGTJEGRVECWDPRTRNRVGLL 
D\AP*TVSQQ1QR*TSLPTISALKFN\GALTMAVGTTTGQVLLY 
DLRS DKPLLVKDHQ YGLP I KS VH FQDS LDL ILS ADSR I V KM WNK 
NSGKI FTSLE PEHDLNDVCLYPNSGMLLTANETPKMGI YYI PVL 
GPAPRWCSFLDNLTEELEENPESNE 


6684 


111 


527 


GLRGGTSRGRAGREPEFAAGVLCWAGFCQSPCPPGGRGREAPA" 

PP\SGRRHA*RPA*WLGGPGGDSGGREEGGS/GELQRAMESKMG 

ELPLDINIQEPRWDQSTFLGRARHFFTVTDPRNLLLSGAQLEAS 
RNIVQNYR 


6685 


25B 


1473


kllgdnfegfcnkfelsdsengsns^qspijVfdrlfdpdpqkvl 

QG V I DMKNAVI GNNKQKANLI VLGAVPR LLYLLQQETS STELKT 
ECAWLGSLAMGTENNVKSLLDCHI I pallogllspdlkfieac 
LRCLRTIFTSPVTPEELLYTDATVIPHLMALLSRSRYTQEYICQ 

ifshcckgpdhqtxlfnhgavqniahlltslsykvrmqalkcfs 
vlafenpqvsmtlvnvlwgellpqifvkmlqrdkpiemqltsa 

KCLTYMCRAGAIRTDDNC I VLKTLPCLVRMCS KERLLEERVEGA 
ETLAYL I EPD VELQRIAS I TDHLI AMLAD YFKY PSSVSAITD I K 

RLDHDLKHAHELRQAAFKLYASLGANDEDIRKKVSLGEGRPPVL 
TASRQGVTST 


6686 
6687 


310 


927 


DSVTFDDIAVuyTPKEWTLLDPTCRNLYRDVMLENYKNIAtVGY 
QLFKPSLISWLEQEESRTVQRGDFQASEWKVQLKTKELALQQDV 
LGEPTS SGIQM 1GSHNGGE VSD VKQCGDVSSEHSCLKTHVRTQN 
SENTFECYLYGVDFLTLHKKTSTGEQRSVFSHVWKKPSSLNPDV 
VCQKNRCTRKKKAF*LQLTLGKSFH*SIHT 


6688


181 


915 


EAMLEAPYKKEEDEQQRKEVKKDYPSNTfSSTSNSGNETSGSST 
IGETSNRSRDRDRYRRRNSRSRSPGRQCRHRSRSWDRRHGSESR 
SRDHRREDRVHYRS PPLATGEPVDNLSPEERDARTVFCMQLAAR 
A.n.mvuaur r o>ivt»R.VKUVK J. 1SDRNSRRSKGIAYVEFCEIQSV 
PLAIGLTGQRLLGVP 1 I VQASQAEKNRLAAMANNLQKGNGGPMR 
LYVGSLHFNITEDMLRGIFEPFGKV 




1025 


1 


A^VPDnrPRVFHKCPDSCWRFKFQP^QLQPYlUS^^EKPPI^F 
SEPGLPR/ SATARMATAAAPPNSS IDLPSDSGMGFI SPAGDSLD 
LPSDGGTGFFSLAGDSSSTRLSSLAFISFSLSSVSVGSSAGTTS 
STSVGSWAAFTSSSSSSTNRDVAGLDFSTVITSVSGSLVPSRE 
VAVI CGS KGAGASGSASCSSRAGKTTEATAASSMPSGTSS FSTC 
TMSELEELFSLFSPAPLLSKLFTSSGSIAICCQDSGPSDTGRLS 
VCQLWLADS DTG KLS DCQE WTVGDS GGLTC PELSLGRM * MS LL 











SSAVIPGYSSSSDSRLNTVPTVDLLCPFQTKSST 


6689 


640 


1299 


SSSASYATSATSISDTAFSGSLKLKHGLLSALDSSSRTS*STSS 
AEDSTFRICSPSVSDTSSDSSGSKDNVLILFSKVSI*SCFSLSS 
FFSDSISFCFSSSSFCKR*FVSSKVSQNALLSSRLSNGPGGSSK 
Q RNS LTARQLAMS h * ATKF * RNACNPNCLS S KKSAL* LS LNQR F 
GGS AS RKPGN I SFNSQKCSALSYCCNFV I KPREVSVSS EN YPAF 


6690 


1 


442 


GTRGKMAATLG PLGS WQQWRRCLSARDGSRMLUjLLLLGSGQGP 
QQVGAGOTFEYLKREHSLSKPYQGVGTGSSSLWNLMGNAMVMTQ 
YI RLTPDMQS KQGALWNRVPCFLRDWELQVHFXIHGQGKKNL\H 
GDGLAIWYTKDRMQP


6691


287 


1401 


lktetseekarrykdrpsqlnavfqeqkkmiqaqesitledvav 
dftweewqllgaaqkdlyrdvmlenysnlvavgyqaskpdalfk 
leqgeqlwtiedgihsgacsdiwkvdhvlbrlqseslvnrrkpc 
hehdafenlvhcsksqfllgqnhdifdiirgksiiksnltlvnqsk 

GYEI KNSVEFTGNGDS FLHANHERLHTAI KFPASQKLISTKSQF 
ISPKHQKTRKLEKHHVCSECGKAFIKKSWLTDHQVMHTGEKPHR 
CSLCEKAPSRKFMLTEHQRTHTGEKPYECPECGKAFLKKSRLNI 
HQKTHTGEKP Y I CSECGKGFIQKGNLI VHQR I HTGEKP Y I CNEC 
/G KG F IQKT CL IAHQR FHTER 


6692


178 



939 


WI KEGELS LWERFCAN I 1 KAGPMPKHIAFIMDGNRRYAXKCQVE 
RQEGHSQGFNKLAETLRWCLNLGILEVTVYAFS IENFKRS KSEV 
DGLMDLARQKFSRLMEEKEKLQKHGVCIRVLGDLHLLPLDLQEL 
IAQAVQATKNYNKCFLNVCFAYTSRHE I SNAVREMAWGVEQGLL 
DPSDISESLLDKCLYTNRSPHPDILIRTSGEVRLSDFLLWQTSH 
SCLVFQPVLWPEYTFWNLFEAILQFQMNHSVLQK 


6693 


178 


939 


WIKEGELSIiWERFCANI I KAGPMPKH IAF IMDGNRR YAKKCQ VE " 
RQEGHSQGFNKLAETLRWCLNLGILEVTVYAFSIENFKRSKSEV 

dglmdlarqkfsrlmeekeklqkhgvcirvlgdlhllpldlqel 
iaqavqatknynkcflnvcfaytsrhbisnavremawgveqgll 
dpsd i seslldkclytnrs phpdi l i rtsgevrlsdfllwqtsh 

SCLVFQPVLWPEYTFWNLFEAILQFQMNHSVLQK 


6694 


292 


813 


SLLLHLAPPGAYTPSQPLSSVSTETASS VRRQAAESRQHELPVR " " 
E VHS LGQ I LPQDGLTAEAG P PE AQDP WG S PG ISL PAAH I G FAAA 
LAVG PS GCHTE P \ FDE VW P S LFLGDAYAARDKS KLI QLG I THW 
NAAAGKFQVDTGAKFYRGMSIiEYYGIEADDNPFFDLSVYFLP 


6695 


292 


813 


SLLLHLAPPGAYTPSQPLSSVSTETASSVRRQAAESRQHELPVR 
E VHSLGQI LPQDGLTAEAGPPEAQDPWGSPGISLPAAH I GFAAA 
LAVGPSGCHTEP\FDEVWPSLFLGDAYAARDKSKLIQLGITHW 
NAAAGKFQVDTGAKFYRGMSLE YYG I EADDNPFFDLS VYFLP 


6696


1 


782 


PRVRGRVGERWAFLSVPAAMSSEMEPLLLAWSYFRRRKFQLCAD 
LCTQMLEKSPYDQAAWILKARAtjTEMVYIDEIDVDQEGIAEMML 
DENA I AC; V PR PGTSLKLPGTNQTGG PSQ AVRP ITQAGRP I TG FL 
RPSTQSGRPGTMEQAIRTPRTAYTARPITSSSGRFVRLGTASML 
TSPDGPFINLSRLNLTKYSQKPKLAKALIEYIFHHENDVKTALD 
LAALSTEHSQYKDWWWK/ DQ I EKC YYRVGM YREAE KQ I kss 


6697 


3 


782 


PPLFLRRLNSRALRPGSRKVMAWPASLSGQDVGSFAYLTIKDR 
IPQILTKVIDTLHRHKSEFFEKHGEEGVEAEKKAISLLSKLRNE 
LQTDKPFIPLVEKFVDTDIWNQYLEYQQSLLNESDGKSRWFYSP 
WLLV\ECYMYRRIHEAI\IQSPPIDYFDVFKESKEQNFYGSQES 
IIALCTHLQQLIRTIEDLD\ENQLKDEFFKLLQISLWGEISVDL 
SL\SGGESSSQNTNVLNSLEDLKPFILLNDMEHLWSLLSNCK 


6698 




754 


vgscacagsckckeckctsckksecrafp 5 


6699 


325 


492 


egelp/parrvlpramtasaqprgrrpgvgvgwvtsckhprcv 
llgkrkgsvgagsfqlpgghlepgetweecaqretweeaalhlk 
nvhfas wnsfi ekenyhyvti lmkgevdvthdsepknvepekn








ESKRI I YNHAFFFQESKWSGGILQ 


6700 


1098 


1392 


TQCWRS S T PGMRTHFRTQ P / RLECGQ G FSQQENGHCMDTN ECIQ 
FPFVCPRDKPVCVNTYGSYRCRTNKKCSRGYBPNEDGTACVERT 
LLLGLCNLLGK


6701


2 


1485 


AAAGPRTRVRRAAAFEGQPS PSPGLG PTSDKAAAPRTPKRRRLW 
RQRQ/HPAMLCYVTRPDAVI^IEVEVEAKANGEDCLNQVCRRLGI 
IEVDY FGLQFTGSKGESLWLNLRNRI SQQMDGLAPYRLKLRVKF 
FVE PHL I LQ EQ TRH I FFLH I KEALLAGHUjCS PEQA VE LS ALLA 
QTKFGDYNQNTAKYNYEELCAKELSSATLNSIVAXHKELEGTSQ 
AS AE YQVLQ I VS AM EN YG I EWHS VRD S EGQ KLI* I GVGP EG I S I C 
KDDFS P INRIAYPWQMATQSGKNV YLTVTKESGNS IVLLFKM I 
STRAASGLYRAITETHAFYRCDTVTSAVMMQYSRDLKGHIiASLF 
LNEN I NLG KKYVFD I KRTS KEVYDHARRAL YNAG WDLVS RNNQ 
S PSHS PLKSSES SMNCSSCEGLSCQQTRVLQEKLRKUCEAMLCM 
VCCEEE INSTFCPCX3HTVCCESCAAQLQVGESAAHFCLQPHLS L 
LLTGSRSQVLAR 


6702


397 


1971 


PLAKFLKriDLVNVLCLPMEDVFLFYRTeyC^MGi^SsCHLSLPK 
RAEALL CSRKATVVRDLVAVRMAEEOE FTOLCICLP AO P.qupurv 
NNTYRS AQHSQALLRGLLALRDSG ILFD WLWEGRH I EAHR I L 
LAASCDYFKGMFAGGLKEMEQEEVLIHGV3YNAMCQILHFIYTS 
ELELSLSNVQETLVAACQLQ I PEI IH FCCD FLMS W VD EEN I LDV 
YRLAELFDLSRLTEQLDTYI liKNFVAFSRTDKYRQLPLE KVYSL 
LSSNRLEVSCETEVYEGALLYHYSLEQVQADQI S LHE PP KLLET 
VR FPLM EAE VLQRLHD KLDP S PLRDTVAS ALMYHRNES LQ P S LQ 
SPQTELRSDPQCWGFGGIHSTPS\MSSATRPKYLNPLLGEWKH 
FTASLAPRMSNQGIAVLNNFVYLIGGDNNVQGFRAESRCWRYDP 
RHNRWFQI QSLQQEHADLS VC WGRY I YAVAGRD YHNDLNAVER 
YDPATNSWAYVAPLKREVYAHAGATIjEGKMYITCGRKGRIT 


6703 


45 


1244 


GVGPRAAAMPLELELCPGRWVGGQHPCFIIAE1GQNHQGDLDVA " 

KRMIRMAKECGADCAKFQKSELEFKFNRKALERPYTSKHSWGKT 

YGEHKRHLEFSHDQYRELiQRYAEEVGIFFTASGMDEMAVEFLHE 

LNVPFFKVGSGDTNNFPYLEKTAK/TRGWHSVLRDVCGVQLNDE 

TSSWDVLGRVRTSKEKVLMVLVLDYSGRPMVISSGMQSMDTMKQ 

VYQIVKPLNPNFCFLQCTSAYPLQPEDVNLRVISEYQKLFPDIP 

IGYSGHETGIAISVAAVALGAKVLERHITLDKTWKGSDHSASLE 

PGELAEXjVRSVRLVERALGSPTKQLLPCEMACNEKLGKSVVAKV 

KI PEGTI LTMDMLTVKVGE P KG YP P ED I FNWGKKVLVT VEEDD 

TIMEE 


6704 


82 


1007 


TMNTRNRVVNSGI^ASPASRP^RDPQDPSGRQGELSPVEDQREG^ 
LEAAPKGPSRESWHAGQRRTSAYTLIAPNINRRNBIQRIABQE 
LANL E KWKE QNRAKP VHL VP RRLGGSQ SETEVRQKQQLQLMQS K 
YKQKLKREESVRIKKEAEEAELQKMKAIQREKSNKLEEKKRLQE 
NLRREAFREHQQYKTAEFIi/RQTEHRIARQKCLSKCCLWPTILN 
MGQ KLG LQ \ DSLKAEENRKLQKM KDEQHQ KS ELLELKRQQQEQE 

RAKIHQTEHRRVNNAFLDRLQGKSQPGGLEQSGGCWNMNSGNSW 
GI 




6705


2


186 


rlcrnsarvpcgwsasrsLgegagfigplrgphpraggtgtsft 

S Y KRKGG I MS T I AAFYGGKS I Ii I TVATGFLGKELM EKLFRTSPD 

lkviyilvrpkagqtlqhrvfqildsklfekvievrpnvhekir 
aiyadlnqndfaiskedmqellsctniifhcaatvrfddtlrha 
vqlnvtatrqlllmasqmpkleafihi staysncnlkh 1 devi y 
pcpvepkkiidslew\lddaiideitpklirdwpniytytk 


6706 


130 


531 


PTHSSSSHSQEMLGKLNMLRNDGHFCDtTlRVQDKIFRAHKVVL 
AACS DF FRTKLVGQ AEDENKNVLDLHHVTVTGF I PLLE YAYTAT 
LSINTENI IDVLAAASYMQMFSVASTCSEFMKSSILWNTPNSQP 
EK


6707 


2233 


1343 


YWSGIGYEIiQHFHWRKFHFEKKGPPSTCQBRLYESRSRWPCIS* 
GMVWGWTAVNGSW*GGQLRCVCVCTSHSSDSTRSSQRASKCHS 
FFILSQ*KT*SSWENWVFAKYSRIYSYGHSCSKGRGD*DFK*NV 
SQAR*SRFCGLCNPCGHCGLDINLRGGSSPWTDKHSCVHNNLLC 
NRRVFSLLCEGPGHCYQGAVCREACAAASPGLDSAAEPHRLCEH 
TD*LPK*GPGYIQHFHCDSNILCILYNISFNLFSYSF*GVARYA 
C*RCHWYFEWLLYNHCGDILVACL+RRQL*SSQ 


6708 


11S . 


1729 


TVGSWSRSGRSPPVGRQLLLTGRGAQAAGSPQGGMALQVELVPT 
GEI I RVVHPHRPCKLALGSDGVR VTMESALTAR DR VGVQDF VLL 
ENFTS EAAFI ENLRRRFRENLI YTYIGPVLVS VNPYRDLQI YSR 
QHMERYRGVSFYEEPPHIiLAVADTVYRALRTERRDQAVMISVES 
GAGKTDATXRLLQLYAETCPAPQRGGAVRDRL LQSN P VLEAFGN 
AKTLRNDNSSRFGKYMDVQFDFKGAPVGGKILSYLLEKSRWHQ 
NHGERNFH I F YQtXEGGEEETLRRLGLERNPQS YLYLVKGQCAK 
VSSINDKSDWKWRKALTVIDFTEDEVEDLLSIAASVIiHLGNIH 
FAANEESNAQVTTEKQLKYLTRLLSVEGSTLREALTHRKIIAKG 
BELLS PLNLEQAAYARDALAKAVYS RTFTWLVGK I NRS LAS KD V 
ESPSWRSTTVLGLLDIYGFEVFQHNSFEQFCINYCNEKLQQLFI 
ELTLKSEQEEYEAEGIAWEPVQYFNNKIICDLVEEKFKGII\SI 
LDE\ECLRPGE 


6709 


3 


854


PPHEHLFPSGERGPFSFLVSRRGLGPGKMGKKGKKEKKGRGAEK"" 
TAAKMEKKVS KRS RKEE EDLEAL IAHFQTLDAKRTQT VE L P CP P 
PSPRLNASLSVHPEKDELILFGGEYFNGQKTFLYNELYVYNIRK 
DTWTKVD 1 PSPPPRRCAHQAVWPQGGGQLWVFGGEFAS PNGEQ 
FYHYKDLWVLHLATKTWEQVKSTGGPSGRSGHRMVAWKRQLILF 
GGFHESTRDYIYYNDVYAFNLDTFTWSKLSPSGTGPTPRSGCQ\ 
I PSLPRAASS VYGGYS KQRVKKDVDKGTRHSDMF 


6710 


158 


980 


RHKMTNYRVESSSGRAARKMRLALMGPAFIAAIGYIDPGNFATN 
IQAGAS FG YQLLWVWWANLMAML I Q I LSAKLG I ATG KNLAEQI 
RDHYPRPVWFYWVQABI IAMATDLAEFIGAAIGFKLILGVSLL 
QGAVLTGIATFLI LMLQRRGQKPLEKVIGGLLLFVAAAY I VELI 
FSQPNLAQLGKGM VI PS L PTSE AVFLAAG VL \ GAT IMPHVI / YI 
WHSS LTQHLHGGS RQQRYS ATKWD VAI AMTI AGFVN LA X MATAA 
SELNFYGHTGVA 


6711 


3 


347 


VTECKTMTCKM S Q LERN I * TM INTLHH YS VKLGHPDTL I HGEFK ' 

elvrtdlhnilm:<enkndqai*himedldtnahmqiifkeliml 
mamltwsyhdnmhdadygpgqqhrpg 


6712 


118 


578 


PHGQKRTRYPQVRAPGQQPQAQLAMALCLKQVFAKDKTFRPRKR 
FEPGTQRFELYKKAQASLKSGLDLRSWRLPPGENIDDWIAVHV 
VDFFNRINLIYGTMAERCS*TSCPVMAGGPRYEYRWQDERQYRR 
PAKLSAPRYMALLMDWI ESLI 


6713 


2<t85 


3 


QARGSDSEIX3EFEIQAEDDARARKLGPGRPLPTFPTSECTSDVE 
PDTR EMVRAQNKKKKKSGG FQSMGLS YP VFKG I MKKG YKVPTP I 
QRKTIPVILDGKDWAMARTGSGKTACFLLPMFERLKTHSAQTG 
ARALILSPTRELALQTLKFTKELGKFTGLKTALILGGDRMEDQF 
AALHENPDI I IATPGRLVHVAVEMSLKLQSVEYWFDEADRLFE 
MGFAEQLQEIIARJ.PGGHQTVLFSATLPKLLVEFARAGLTEPVL 
IRLDVDTKLKEQLKTS FFLVREDTKAAVLLHLLHNWRPQDQTV 
VFVATKHHAEYLTELLTTQRVSCAHI YSALDPTARKINLAKFTL 
GKCSTLI VTDLAARGLD I PLLDNV1NYS FPAKGKL FLHRVGRVA 
RAGRSGTAYSLVAPDEIPYLLDLHLFLGRSLTLARPLKEPSGVA 
GVDGMLGRVPQSWDEEDSGLQSTLEASLELRGLARVADNAQQQ 
YVRSRPAPSPES I KRAKEMDLVGLGLHPLFSSRFEEEELQRLRL 
VDS I KNYRSRATI FEINAS SRDLCSQVMRAKRQ KDRKAI ARFQQ 
GQQGRQEQQEGPVGPAPSRPALQEKQPEKEEEEEAGESVEDIFS 
EWGRKRQRSGPWRGAKRRREEARQRDQEFYIPYRPKOFDSERG 
SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


6714


Tea ' 




LSISGEGGAFEQQAAGAVLDLMGDEAQNLTRGRQQLKWDRKKKR 
FVGQSGQEDKKKIKTESGRYISSSYKRDLYQKWKQKQKID*S*L 
GRRRG1LTRRRPRTBEVGEARPLAQAGCIPGPHAPRHPLQAESA 
LELKTKQQ I LKQRRRAQKAALS LQRWWPQAALCPQ 


O / XH 


169 


1416 


NN CQELL P P P P APMAH I PS GGAPAAGAAPMG PQ Y CVC KV E L S VS 
GQNLLDRDVTSKSDPFCVLFTENNGRWIEYDRTETAINNLNPAF 
SKKFVLDYHFEEVQKLKFALFDQDKSSMRLDEHDFLGQFSCSLG 
TIVSSKKITRPLLLLNDKPAGKGLrTIAAQELSDNRVITLSLAG 
RRLDKKDLFGKSDPFLEFYKPGDDGKWMLVHRTEVIKYTLDPVW 
KP FT VPLVS LCDGDME KP I Q VMCYD YDNDGGHD FIGE FQTS VSQ 
MCEARDSVPLEFECINPKKQRKKKNYKNSGI 1 ILRSCKINRD YS 
FLDYILGGCQLMFTVGIDFTASNGNPLDPSSLHYINPMGTNEYL 
S AI WAVGQ 1 1 QDYDS D KMFPAXjG FG AQLP PDW KVSHE FAINFNP 
TNPFCSGVDGIAQAYSACLP 


6715 


32 


493 


GPAGAESGSLHCLPATVQAXiAGAAHSPHGGQPPRRGPLIGSGMP 
GKPKHU3VPNGRMVLAVSDGELSSTTGPQGQGEGRGSSLSIHSL 
PSGPSSPFPTEEQPVASWALSFERHiQDPLGLAYFTEFLKKEFS 
AENVTFWKACERFQQI PASDT 


6716 


1 


1^76 


GAGG PAPRS FGS E E PRAALERDKMS ARAAAAKS TAME ETAI WEQ 
HTVTLIIRVSLCC3K 


6717 


115 


696 


LFAMSGFENLNTDFYQTS YS I DDQSQQS YDYGGSGGPYS KQ YAG 
YDYS OQGRFVP PDMMQPQQ P YTGQ I Y Q PTQAYTPAS PQPFYGNN 

FEDEPPLLEELGINFDHIWQKTLTVLHPLKVADGSIMNETDLAG 
PMVFCIAFGATLLLAGKIQFGYVYGI SAIGCLGMFCLLNLMSMT 
GVSFGCVASVLGYCIiLPMILLSSFAVIFSLQGMVGI ILTAG1 IG 
WCSFSASKIFISALAMEGQQLLVAYPCALLYGVFALISVP 


6718


290 


599 


KUbaTVPGTILPSLKWHNSGLCKFPETGGKMTTFKEGljTFKDVA 
VIFTEEELGLLDPVQRNIjYQDVMLENFRNIJjSVGHHPFKHDVFL 
LEKEKKLDIMKTATQ 


6719 
6720


1 


691 


Pl'RPESQDREDGKCHKMEMNPISGNtNCDPIAMSQCSSDHGCET 
DLDSDDDKIEKPNNFMKDSASQDNGLSRKISRKRVCSSDSDSSL 
QWKKSS KARTG LLRI TRRCAATAAN KI KLMS D VED VSLENVHT 
RSKNGRKKPLHLACTTAKKKLSDCEGSVHCEVPSEQYACEGKPP 
DPDSEGSTKVLSQALNGDSDS EDMLNS EHKHRHTNIHKIDAPS K 
RKSSSVTSSG 




3 


822


HEVAEEAGGTVYPQRGTMPGTKRFQHVI ETPE" PG KWEJjTG YEAA 
VPITEKSNPLTQDLDKADAENI VRLLGQCDAE I FQEEGQALST Y 
QRLYSES I LTTMVQVAGKVQE VLKEPDGGLWLSGGGTSGRMAF 
LMS VS FNQLM KGLGQKPLYT YL I AGGDRS WAS R EGTEDS ALHG 
IEELKKVAAGKKRVIVIGISVGLSAPFVAGQMDCCMNNTAVFLP 
VLVG FNP VSMARH PFP PP R I LRSI/TVFP S LRAPH YQ I TSIiL FS M 
SWTLISE 


6721 


3 


822 


HBVAEEAGGTVYPQRGTMPGTKRFQHVIETPEPG'KWELTGYEAA 
VPITEKSNPLTQDLDKADAENIVRLLGQCDAEIFQEEGQALSTY 
QRLYSESILTTMVQVAGKVQEVLKEPDGGLWLSGGGTSGRMAF 
LMSVS FNQLMKGLGOKPLYTYL1AGGDRS WASRFRTPnQfcT un 
I E ELKKVAAGKKRVI VI GI S VGLS AP FVAGQMDCCMNNTAVFL P 

VLVGFNPVSMARHPFPPPRILRSLTVFPSLRAPHYOirSL^FSM 
SWTLISE 


6722 
6723


1 


390 


rswskrtwqalpmavlflljLflcgtpqaadnmqaiyvalgeave 

L P C P S P S TLHGDEHLS W FCS PAAGS FTTL VAQ VQ VGRPAPDPG K 
PGRESRLRLLGNYSLWLEGSKEEDAGRYWCAVbGQHHNYQNW 




173 


659 


VCQYCTARMAD PG I SAGQFVAWWDKSS P VEALKGLVDKLQAIiT 
GNEGRVS VENI KQLLQSAHKESSFDI ILSGLVPGSTTLHSAEIL 
AE I AR I LRPGGCLFLKEP VETAVDNNS KVKTAS KLCS ALTLSGL 
VE VKE LQRE PLTPEE VQS VRBHLGHESDNL 


6724 


173 


659


VCQ YCTARMADFG I S AGQFVAWWDfCSS PVBALKGLVDKLQALT 
GNEGRVS VENI KQLLQS AHKESSFDI ILSGLVPGSTTLHS AEI L 
AE X AR I LRPGGCL PL KE PVETAVDNNSKVKTAS KLCS ALTLSGL 
VEVKELQREPLTPEEVQSVREHLGHESDNL 


6725 


356 


722 


RRRTPPVILATMDDDLMLALRLQEEWNLQEAERDHAQESLSLVD 
AS WELVDP TPDLQALFVQFNDQPFWGQLEAVEVKWS VRMTLCAG 
I CS YEGKGGMCS IRLSEPLLKLR PRKDL VE VFFV 


6726 


98 


714 


HLQKMERKINRREKBKEYEGKHNSLEDTDQGKNCKSTLMTLNVG 
G YL YITQKQTLT KY PDTFLEG I VNGKI LCP FDADGHY F I DRJDGL 
LFRHVLNFLRNGELLLPEGFRENQLLAQEAEFFX2LKGLAEEVKS 
RWE KEQLTPRETTFLE I TDNHDRS QGLR I FCNAPDF I S KI KSR I 
VLVS KSRLDGFPEEFS ISSNI IQFKYFIK 


6727 
6728 


1 


831 


FRGMGDERPHYYGKHGTPQKYDPTPKGPivWRCiCTDiiCCVFLL 
LA I VG YVAVG I IAWTHGDPRKVI YPTDSRGE FCGQKGTKNENKP 
YLFYFN1VKCASPLVLLEFQCPTPQICVEKCPDRYLTYLNARSS 
RDFEYYKQFCVPGFKNNKGVAEVLRDGDCPAVLIPSKPLARRCF 
PAIHA YKG VLMVGNETTYEDGHGS RKN I TDL VEGAKKANG VLEA 

RQLAMRIFEDYTVSWYWDI ISLGIAMAMSLLFI ILLRFLAGIMG 
RGMIIMGILVLGY 


~~S7~2§ — 


486 


93 S 


FCSSWLRSLADSSI^WKMFLVGLTGGIASGKSSVidVF^QLGCA 
VI DVDVMARHWQ PG YPAHRRI VEVFGTEVLLENGDINRKVLGD 
LI FNQ P DRRQLLNAI THPE I R KEMMKETFKYFLRE PRTS PRGKK 
HVPS ALKEADS LMRRDT 




259 


1191 


VGLTGAQSGRTASMGRDQRAVAGPALRRWLUJGTVTVGFIiAQSV 
LAGVKKFDVPCGGRDCSGGCQCYPEKGGRGQ PG P VG PQG YNG P P 
GLQG F PGLOGRKGDKGERGAPGVTG P KGDVGARG VSGFPGADG I 
PGHPGQGGPRGRPGYDGCNGTQGDSGPQGPPGSEGFTGPPGPQG 
PKGQKGEPYALPKEERDRYRGEPGEPGLVGFQGPPGRPGHVGQM 
GP VGAPGRPGPPG PPGPXGQQGNRGLG FYGVKG B KGDVGQ PGPN 

GIPSDTLHPHAPTGVTFHPDQYKGEKGSEQEPGIRGISLKGEE 
GIM 


6730 


784 


1015 


NMVDYYEVLGLQRYASPEDIKKAYHKVALKWHPDKNPENKEEAE 
RKFKEVAEAYEVLSNDEKRDIYDKYGTEGLNEF 


6731 


1 


446 


GiRIWLHGAWPRVEVGCPWETRSSEGVHLERPTSPLKNNDEGS 
LDI YAGLDSAVSDSASKS CVPSRNCLDLYEE ILTEEGTAKEATY 

NDLQVEYGKCQLQMKELMKKFKEIQTQNFSLINENQSLKKNISA 
LIKTARVE INRKDE E I 


6732 
6733 


102 


1205 


GRWQRRPPPpsPPLWCLQPGGGSDPQQIiTQLRHCLSHSPQDTPW 
AQRQVCYTAATTQAAAPATRNCLPDHSGHRPTPPRSHRHHRQEN 
LGSIKPSSRSTKATSTTMAGDGRRAEAVRBGWGVYVTPRAPIRE 
GRGRLAPQNGGSSDAPAYRTPPSRQGRREVRFSDEPPEVYGDFE 
PLVAKERS P VG KRTR LEEFRS DS AKEE VRESAYYLRS RQRRQPR 

PQETEEMKTRRTTRLQQQHSEQPPLQPSPVMTRRGLRDSHSSEE 
DEASSQTDLSQT2SKKTVRSIQEAPAVSEDLVIRLRRPPLRYPR 
YEATS VQQKVNFSEEGETEEDDOD9 <? ^VTTV/wnp c onenpcn 

DKTTRSSSQYIESFW 


6734 


*13 


nrr 


RSCRQVGMRSRNQGGESASDGHISCPKPSltGNAGEKSLSEDAK 
KKKKSNRKEDDVMASGTVKRHLKTSGECERKTKKSLELSKEDLI 
QLLSIMEGELQAREDVIHMLKTEKTKPEVLEAHYGSAEPEKVLR 
VLHRDAILAQEKSIGEDVYEKPISELDRLEEKQKETYRRMLEQL 
LLAE KCHRR T VYELEWE KHKHTD YMNKS DD F TNLL EQ ERE RL KK 
LLEQEKAYQARKE 




189


551 


SAAMFPVFSGCFQELQEKNKSLELVSFEEVAVHF^WEEWQDLD^ 
RQRTLYRDVMLETY5SLV5LGHCITKPEMIFKLEQGAEPWIVEK 
TLNLRLSGGSKKQVFSGICHRSLVELQEVHLV " "~| 


6735 


280 


558 


ksrragvtkmsnpplkqvfnkdktfrpkrkfepgtqrpblhkkA' 
qas lnagldlrlavqlppgedlndwvavkwdffnr vnli ygt i 

XDGCT 


6736 


195 


808 


MNYELNFKREMPNIKSLGLTNLNFUiKRLSSVLPLITDYVYFEN 
SSSNPYLIRRIEELNKTASGNVEAKWCPYRRRDISNTLIMLAD 
KHAKE I E EES ETTVEADLTDKQ KHQLKHRELFLSRQ YES LP ATH 
IRGKCSVALIjNETESVLSYLDKEDTFFYSLVYDPSLKTLLADKG 
BIRVGPRYQADIPEMLLEGTFFCVFAVL 


6737 


150


1209 


PVIMPLHFS PGbiVRPS CC VSSS PKLRRNAHSRLiES YRPDTDLS 
REDTGCNLQHISDRENIDDLNMEFNPSDHPRASriFLSKSQTDV 
REKRKSLFINHHPPGQIARKYSSCSTIFLDDSTVSQPNLKYTIK 
CVALAI YYH I KNRDPDGRMLLDI FDENLHPLSKSE VPPDYDKHN 
PEQKQIYRFVRTLFSAAQLTAECAIVTLVYLERLLTYAEIDICP 
AN W KR I VLG A I LLAS KVWDDQAVWNVD YCQ I UCDITVEDMMELE 
RQFLEIMFNINVPSSVYAKYYFDLRSLAEANNLSFPLEPLSRE 
JiAH KLEAI S R LCE DKYKDLRRS AR KRSASADNLTL PR WS PAI I S 


6738 


148 


653 


CACAEQPARAE VGAATALPVRWASGEMAPSGS LAVPIiAVLVLLL 
WGAPWTHGRRSNVRVITDENWRBLLEGDWMIEFYAPWCPACQNIj 

qpewesfaewgedlevniakvdvteqpglsgrfiitalptiyhc 
kdgefrryqgprtkkdfinfisdkewksiepvsswf 


6739


3 


631 


swpdmaeekvaklhkhlmllrqeyvkuj^laetekrcalCa^^ 

ankesssesfisrllaivadlyeqeqysdlkikvgdrhisahkf 

v^aarsdswslanlsstkeldlsdanpevtmtmlrwiytdelef 

reddvfltei^klanrfqlqllrercekgvmslvnvrncirfyq 

taeelnastlmnycaeiiashwvsevegvnkal 


6740 


3 


631 


SWPDMAEEEVAKLEKHimLRQEWKLQKKlAETEKRCALI^ 
ANKESSSESFISRLLA1VADLYEQEQYSDLKIKVGDRHISAHKF 

vlaarsdswslanlsstkeldlsdanpevtmtmlrwiytdelef 

REDDVFLTELMKLANRFQLQLLRERCEKGVMSLVNVRNCIRFYQ 

taeelnastlmnycaeiiashwvsevegvnkal 


6741 




960


PLTLPFSSRARAGHTMNTSPGTVGSDPVlLATAGYDHTVRFWQA 
HSGICTRTVQHQDSQVNALEVTPDRSMIAAAVQPVSLGYQHIRM 
YDLNSNNPNPIISYDGVNKNXASVGFHEDGRWMYTGGEDCTARI 
WDLR SRNLQCQR I FQVNAP I NCVCLH PNQAEL I VGDQSGA I H I W 
DLKTDH NEQL I PEP E VS I TS AHI DPDAS YMAAVNS TLVP FS CLL 

PLiAIGILQEGEFESLARRGIiLFLACQGNCYVWNLTGGIGDEVTQ 
LIPKTKIP 


6742


141


960 


PLTLPFSSRARAGHTMNTSPGTVGSDPVILATAGYDHTVRFWQA" 
HSGICTRTVQHQDSQVNALEVTPDRSMIAAAVQPVSLGYQHIRM 
YDLNSNNPNPI I S YDGVNKNIAS VGFHEDGR WMYTGGEDCTAR I 
WDLRSRNLQCQR I FQVNAP INCVCLH PNQAEL I VGDQ SGA I H I W 
DLKTDHNEQL I PE P E VS I TS AH I DPDAS YMAAVNSTLVPFS CLL 
PLAI G I LQEGE FES LARRGLLFLACQGNCYVWNLTGG I GDE VTQ 
LIPKTKIP 


6743 


1 


ra ■ 


ALRKCSDLEKAIATTALIFRNSSDSDGKLEKAIAKDLLQTQFRN 

FAEGQETKPKYREILSELDEHTENKLDFEDFMILLLSITVMSDL 
LQNIR 1 


6744 


95 


1343


RTPARNR CAGCE VLS R FSS PNKASSFALQSAGGGL PA VRALRRD 
RQKVSTVG YGMDEVEQD QHBARL KEL FDS FDTTGTGS LGQEELT 
DLCHML S L EE VAP VLQQ TLLQDNLLGR VH FDQFKEAL I LI LSRT 
LSNEEHFQE PDCS L E AQPKYVRGGKR YGRRS LPE FQES VEEFPE 
VTVIEPLDEEARPSHIPAGDCSEHWKTQRSEEYEAEGQLRFWNP 
DDLNASQSGS S PPQD W I EB KLQE VCEDLG ITRDGHLNR KKLVS I 
CEQYGLQN VDGEMLEEVFHNLDPDGTMS VEDFFYGLFKNGKS LT" ~ 
PSASTPYRQLKRHLSMQSFDESGRRTTTSSAMTSTIGFRVFSCL 
DDGMGHASVERILDTWQEEGIENSQEILKALDFGLDGNINLTEL 
TLAIiENELLVTKNS I HQACI 


6745 


1 


588 


TFRDQGWAQRRRWLLGCASWESWEAAIAAGPGliPSSTARQQNNP ~ 
AAGTECFAAVWARGTAMGS VLS TDSG KS APAS ATARALERRRDP 
ELPVTSFDCAVCLEVLHQPVRTRCGHVFCRSC1ATSLKNNKWTC 
P YCRAYL PS EGTOATD VAKRMKS E YKNCAECDTLVCLS EMRAH I 
RTCQKYIDKYGPLQELEETA 


6746 


110


492 


GATGAMAESAPARHRRKRRSTPLTSSTLPSQATEKSSYFQTTEI 
S LWTWAAI QAVEKKMESQAARLQSLEGRTGTAE KKLAD CE KMA 
VEFGNQLEGKWAVLGTLLQEYGLLQRRLENVENLLRNRN 


6747 


247 


484 


EAVTFKD VA WFTE E E LGLLDLAQRKLYRD VMLENFRNL LS VGH 
QPFHRDTFHFLREEKFWMMD IATQREGNS VYAGVC 


6748 


201 


665 


mttfkeavtfkovawfteeelgi*ldpaqrklyrdvmLenfrnl 

LS VGNQP FHQDTFHFLG KE KFWKMKTTS QREGNSGG KI Q I E MET 
VPEAGPHEEWSCQQIWEQIASDLTRSQNSIRNSSQFFKEGDVPC 
Q I EARLS I SXVQQXP YRCNECKQ 


6749 


95 


719 


RREVKGGDGVCPRARGSPQSQQFPSCAGGGEGLQQSGEALDGAM " 
SAGGPCPAAAGGGPGGASCS VGAPGGVSMFRWLBVLE KEFDKAF 
VDVDLLLGE I DPDQADITYEGRQKMTSLSSCFAQLCHKAQS VSQ 
INHKLEAQLVDLKSELTEl'QAEKWLEKEVHDQLLQLHSIQLQL 
HAKTGQS ADS GTI KAKLSG PS VEE LE RELKAN 


6750 


3 


428


SCESRRPGAIG^VWASGALPRDTTGIiGSEQPSGDVAQSNRATMGT 
TAPG P I H LLELCDQKLMEFLCHMDNKDLVWLEE I QEEAERMFTR 
E FS KE PELM PKT PSQKNRRKKRR I S YVQDENRDP I RRRLSRRKS 
RSSQLSSRR 


6751 


152 


1417 


PTKATEMAGAS VKVAVRVRP FNS REMSRDSKC 1 t QMS GS~TTTI V 
NPKQPKETPKSFSFDYSYWSHTSPEDINYASQKQVYRDIGEEML 
QHAFEGYNVCIFAYGQTGAGKSYTMMGKQEKDQQGIIPQLCEDL 
FSRINDTTNDNMSYSVEVSYMEIYCERVRDLLNPKNKGNLRVRE 
HPLLGP YVEDLS KLAVTS YNDIQDLMDSGNKARTVAATNMNETS 
SRSHAVFNI I FTQKRHDAETNITTEKVSKISLVDLAGSERADST 
GAKGTRLKEGANINKSLTTLGKVISALAEKDSGPNKNKKKICKTD 
FI P YRDS VLTWLLR ENLGGNS RTAMVAALSP ADI NYDETLS TLR 
YADRAKQIRCNAVINEDPNNKLIRELKDEVTRLRJDLLYAQGLGD 
I TDMTNALVGMS PS SSLSALS SRNV 


6752


24 


1834 


RNCVP PLGC YRSRVKFHS D I KMQ YSHHCEHLLE RLNKQREAGFL " 

roCTIVIGEFQFKAHRNVLASFSEYFGAIYRSTSENNVFLDQSQ 

VKADG FQKLLEF I YTGTLNLDS WNVKEIHQAAD YLKVEE WTKC 

KIKMEDFAFIANPSSTEISSITGNIELNQQTCLLTIjRjDYNNREK 

SEVSTDLIQANPKQGALAKKSSQTKKKKKAFNSPKTGQNKTVQY 

PSDILENASVELFLDANKLPTPWEQVAQINDNSELELTSWEN 

TPPAQDIVHTVTVKRKRGKSQPNCALKEHSMSNIASVKSPYEAE 

nsgeeldoryskakpmcntcgkvfseasslrrhmrihkgvkpyv 

CHLCGKAFTQCNQLKTHVRTHTGEKPYKCELCDKGFAQKCQLVF 
HSRMHHGEEKPYKCDVCNLQFATSSNLKIHARKHSGEKPYVCDR 
CGQRFAQASTLTYHVRRHTGE KP YVCDTCGKAFAVSSSL 1 THS R 
KHTGEKP F I CELCGNS YTD I KNLKKHKTKVHSG ADKTLDS SAED 
HTLS EQ D S I Q KS PL S ETMDVKPSDMTLPLALPLGTEDHHMLL P V 
TDTQSPTSDTLLRSTVNGYSEPQLIFLQQLY 


6753 


2 


1305 


VPS LPYP PQKWAHTEFTTSSDSETANGIAKPDPVMPGGEEKAS 
PFGIKLRRTNYSLRFNCDQQAEQKKKKRHSSTGDSADAGPPAAG 
SARGEKEM EG VALKHGPSLPQERKQAPS TRRDSAE PSSSRSVPV 
AHPGPPPASSQTPAPEHDKAANKMPLAQKPALAPKPTSQTPPAS 




6754






¥L»s> Kij t> UP XLv QltLS RkAGKt'DPE P SE PS KBDQBSS DRRPPS P P" 

GPEERXGQKRDEEEEATERKPASPPLPATQQEKPSQTPEAGRKE 

KPMLQSRHSLDGSKLTEKVETAQPLWITLALQKQKGPREQQATR 

EER KQAREAKQAEKLS KENVSVS VQPGSSSVS RAGS LHKSTALP 

EEKRPETAVSRLERREQLKKANTLPTSVTVEISYSSPAAPLVKE 

VSKRFSSPDDAPVSSEPAWLALAKRKAKAWSDCPLIIK 




6755 


2


413 


FVRRRRRRLGGPEVNTMSSLHKSRIADFQDVLKEPSlAIiEKLRE 

LSPSGIPCEGGLRCLCWKILLNYLPLERASWTSILAKQRELYAQ 

PLREMIIQPGIAKANWGVSREDVTFEDHPLNPNPDSRWNTYFKD 
NEVLL 






298 


1343 


pglolovaleadwfldmp^6rrgpsrqqlsrsaLpslqtlvggg 

CGNGTGLRNRNGSAIGLPVPPITALITPGPVRHCQIPDLPVDGS 
LLFEFLFFIYLLVALF1QYINIYKTVWWYPYNHPASCTSLNFHL 
IDYHIAAFITVMLARRLVWALISEATKAGAASMIHYMVLISARL 
VLLTLCG WVLCWTLVNLFRSHS VLNLLFLGYP FG VY VPLCC FHQ 
DSRAHLLLTDYNYWQHEAVEESASTVGGLAKSKDFLSLLLESL 
KEQFNNATP I PTHSCPLS PDLIRNE VECLKADFNHRIKEVLFNS 
LFSAYWAFLPLCFVKVSGYLTFMCFLDLCVNYINWVFLV 




6757 


180 


754 


IKRALGSLPLS I PVSWGSLRTLKYQQQPLRPKVLLCQTRVQCHD - ' 
LRSLQPQPPGLKQSFCLRVLGU3TGATTPGLRDLTCKELIILTE 
REAQKRKKRKEKESGMALTQGPLTFRDVAIEFSQEEWKSLDPVQ 
KAL YWD VMLENYRNLVFLGKDNFALE VKI CP RVFL YFLCCLS W E 
PFHYLTETEALLTHK 


6758 


2 


459 


NSRVEAPEAHSRESQGSDAMRKHLSWWWLAlVtMi.LFSHLSAVQ 
TRGI KHR I KWNRKALPSTAQI TEAQVAENRPGAFI KQGRKLDID 
FGAEGNR YYEANY WQ FPDG I HYNGCS EANVTKE AFVTGC I NATO 
AANQGE FQKPDNKLHQQVL W 




1 


1008 


ASGPELPGRRFRDRAPWLPARLLRGVLAVWVSLSAJW5PGSFCRR 
RVPSLAQLGHSEAAPSPDDVRWSRVPDRCPEERDRAWPPPPPPS 
LPPSFRRNMANNS PALTGNS QPQHQAAAAAAQQQQQCGGGGATK 
PAVSGKQGNVLPLWGNEKTMNLNPMILTN ILSS P YFKVQLYELK 
TYHE WDE I Y FKVTHVE P WE KGS RKTAGQTGMCGG VRGVGTGG I 
VSTAFCLLY KLFTLKLTR KQ VMGL ITHTDS P YI RALGFMY IR YT 
QPPTDLWDWFESFLDDEEDLDVKAGGGCVMTIGEMLRSFLTKLE 
WFSTLFPRIPVPVQKNIDQQIKTRPRKI 


6759 
6760 


1 


513 


RKHNFHSLDGTSTRAFHPQTGLPLI^SPVPQRKTQSGCFDLDSS 
LLHLKSFSSRSPRPCLNIEDDPDIHEKPFLSSSAPPITSLSLLG 
NFE BS VLN YRFDPLG I VDGFTAE VGAS GAFCPTHLTL P VE VS FY 
S VSDDNAPS PYMG VI TLESLGKRG YRVPPSGTIQWCVL 




239 


606 


VLSKKKGLSAEEKRTRMMEIFSETKDVFQLKDLEKIAPKEKGIT 
AMS VKE VLQS LVDDGMVDCER IGTSNYYWAFPS KALHAR KHKLE 
VLESQLSEGSQKHASLQKSIEKAKIGRCBTEERT 




29 


1733 


ERTLRGLREVAAPSDVADAAVSRRGRCCCCLHCTQTQVAQDCP5 - 
S S S SVQRCE LS LFQSLHTMTS K K L VNS VAGCADDALAGLVACNP 
NLQLLQGHRVALRSDLDSLKGRVALLSGGGSGHEPAHAGFIGKG 
MLTGVIAG AVFTS PAVGS ILAAIRAVAQAGTVGTLLI VKNYTGD 
RLN FGLAR EQARAEG I P VEMW I GDDSAFTVLKKAGRRG LCGT V 
L I HKVAGALAEAGVG LEE I AKQ VNVVTKAMGTLGVS LS S CS VPG 
S K PTFELSADEVELGLG IHGEAGVRR I KMATADE I VKLMLDHMT 
NTTNASHVP VQPGSSVVMMVNNLGGLSFLELGI IADATVRSLEG 
RG VKIARALVGTFMS ALEMPG I S LTLLLVDE PLLKL I DAE TTAA 
AWPNVAAVS I TGRKRSRVAPAEPQEAPDSTAAGGSASKRMALVL 
ERVCSTLI^LEEHLNALDRAAGDGDCGTTHSRAARAIQEWLKEG 
PPPASPAQLLSiaSVLLLEKMGGSSGALYGLFLTAAAQPLICAKT 
SL PAWSAAMDAGLEAMQ KYG KAAPGDRTMLDS LWAAGQBL 


6762


3 


613 


QVA FI TLAVAAGLYYLAEL IEB Y TVATS RI I KYMI WFS TAVL IG 
LYVFERFPTSMIGVGLFTNLVYFGLLQTFPFIMLTSPNFILSCG 

c i~i Hi c i i err oil V unlr JL r krJjW X J. tr c Air Jv SLiSA 

GBNVL PSTMQPGDDWSN YFTKGKRGK 


6763 




760 


SGPDFPGRRFRGCCCVRPPAGAGMEIiGGHWDMNSAPRLVSETAE 
RKQEQKTGTBABAADSGAVGARRFLLCLYLGGFLDLFGVSMWP 
LLS LH VKS LG AS PTVAG I VGS S YG I LQLFS STLVGCWS DWGRR 
o*9uunwiuiiaiiiiMi uiuwAnin vf JjtM/JjARVPAGIFKHTIjSISH 
ALLSDWPEKERPLVIGHFNTASGVGFILGPWGGYI.TELEDGF 
YLTAF I CFLVF I LNAGL WJ FFPRREAKPGSTE 


6764


80 


438 


LKKMDTMMLSVRNLFEQLVRRVEILSEGNEVQFIQLAKDFEDFR 
KKWQRTDHELGKYKDLI^IKAETERSALDVKLKHARNQVDVEIKR 
RQRAEADCEKLERQIQLIREMLMCDTSGSIQ 


6765 
6766


3 


550 


ARYSRVDHFCRRRCRAVARAPRFLliQFPSGPSRHFLAACVARWL 
RG5 VLVSEALSGSAKDGI VTEVAVGVKRGSDELLSGSVLSS PNS ! 
NMS SM WTANGNDS K K FKG E D KMDGAPSRVLH IRKLPGEVTETE j 
VIALGLPFGIWrNILMLKGKNOAFLEl^TEEAAITNGNY^SAVT ' 




1 


1287 


eggsfkasltwlwplgemklhcevevisrhlpalglrnrgkgvpT - ' 
avlslcqqtsrsqppvrafllistlkdkrgtryelrenieqfft 

KFVDEGKATVRLXEPPVDICLSKANSSSLKGFLSAMRLAHRGCN 
VTJTPVSTLTPVKTSEF^FKTKMVITSKKDYPLSKNFPYSLEHL 
QTSYCX5LVT?VDMRMLC1jKSLRKLDIjSHNHIKKLPATIGDLIHLQ 

elnijjjdnhlbsfsvalchstlqkslwsldlsknkikalpvqfcq 
lqelknlkijjdneliqfpckigqlinuiflsaarnklpflpsef 

RNLSLEYLDLFGNTFEQPKVLPVIKLQAPLTLLESSARTILHNR 
IPYGSHIIPraLCQDLDTAKICVCGRFCLNSFIQGTTl>^HSV 
AHTWLVDNLQQTEAP I ISYFCSLGCYVWSSDI 


6767 


336 


919 


APMI CLCSSLJLiQFRYKEAFLRDRGLQlGYCS VDDDPRMKHFLNV - " 
w«\jjv/iL3iy«c* j x\jvL»r rtJVoKoyr noSTDQPGLLiQAKRSQQIjASDVHY 
RQPLPQPTCDPEQLGLRHAQKAHQLQSDVKYKSDLKLTRGVGWT 
PPGSYKVEMARRAAEIJ^ARGIiGLQGAYRGAEAVEAGDHQSGEV 
NPDATE I LHVKKKKALLL 


6768
6769 


2 
284 


363 
3 96 


PGS TI SC YLLSEGSLPLCMQ VACGEEKHRAPTMKTLRAR FKKTE ' ' 
LRLSPTDLGSCPPCGPCP I PKPAARGRRQSQDWGKS DERLLQAV 
ENNDAPRVAALIARKGLVPTKLDPEGKSAFHL 


6770


1 


397 


MSTPD FS TAENNQELANE VS CIjKAM LTLMLQAMGQAD " 

WKNYQVIWS^TMAiaHDYYKDEVVKK^ 

I TLNMG VGEAIADKKLLDNAAADLAAI S GQ KPL I TKARKS VAGF 

KIRQGYPIGCKVTLRGERMWEFFERLITIAVPRIRDFRGLSAKS 


6771 


3 


370


APAGTLAM TGKS VKDVDR YQA VLANIiLLEEDNKFCADCQS KG PR ' 
WAS WN IG VFIC I RCAG IHRNIX5VH I SRVKSVNLDQWTQEQ I QCM 
QEMGNGKANRLYEAYLPETFRRPQIDPYLFWSNLEG 


6772 


1 


1406 


AAAFLQGMT VNGF INT VI TS L \ERR YDt»HS YQSGL a ASS YDIAA 
CLCLTFVSYFGGSG\HKPRWLGWGR\VLMGTGSLVFALPHFTAG 
P+*GWKLDAGVRTCPANPR\PVCAG\HTSGLSRYQLVFMLGQFL 
HGVGATPLYTLGVTYLDENVKSSCSPIYIAIFYTAAILGPAAGY 
LIGGALLNIYTEMGRRTELTTESPLWVGAWWVGFLGSGAAAFFT 
AVPILGYPRQLPGSQRYAVMRAAEMHQLKDSSRGEASNPDFGKT 
IRDLPLSIWLLLKNPTFILLCLAGATEATIilTGMSTFSPKFLES 
QFSLSASEAATLFGYLWPAGGGGTFLGGFFVNKLRLRGSAVIK 
FCLFCTWS LLG ILVFSLHCPS VPMAG VTAS YGGSLLPEGHLNL 
TAPCNAACSCQPEHYSPVCGSDGLMYFSLCHAGCPAATETNVDG 
QKVYRDCS CIPQNLSSGFGHATAGKCTST 


6773 


1 


630 


PWEAPKEHK^K'A>?RHT\A/TiTWTVlT?Pr^UFT3l?rv\/L?n/^r~\riV^<^^f7T^ — 
nwiM nnaoni w vjji v ivjCirUn r^r r UYHKQLYHKCTHKG 

RPGPQPWCATTPNFDQDQRWGYCLEPKKVKDHCSKHSPCQKGGT 

CVNMPSGPHCLCPQHLTGNHCQKEKCFEPQLLRFFHKNEIWYRT 

EQAAVARCQCKGPDAHCQRLASQACRTNPCLHGGRCLEVEGHRL 

CHCPVGYTG?PCDVGE*GSGASRRPAPRWDGLAR 


6774 


146


389 


LTELSDQQ Y ?LF FI LS S / WVPTFLSMDVDGR VI KADS FS KI IS S " 
GLRIGFLTGPKPLIERVILKIQVSTLHPSTFNQLMISQ 


6775 


104 


614 


TCPSQLRVLtaRGGRRAPSPQIiWTLVI^ALIEEKWRSHRILRMNS 
GRPETMENLPALYTIFQGEVAMVTDYGAFI KI PGCRKQGLVHRT 
HNS S CRVDK PS EI VDVG DKVWVKL IGREMKNDR I KVS LS MKWN 
QGTGKDLDPNNV\SLSKKRGGGDPSRITLGRRSPLRLS 


6776 


3 


1108 


HERHERHEGALSQDALLRISIPLDSNMRPEKCRRFVHPQWQLLH 
IjNGT FPNTSDADME PC VDGWVYDRI S FSS T I VTEWDL VCDS QS L 
TSVAKFVFMAGMMVGGILGGHLSDRFGRRFVLRWCYLQVAIVGT
CAALAP-rFLIYCSLRFLSGIAAMSLlTNTIMLIAEWATHRFQAM 
GITLGMCPSGIAFMTLAGLAFAIRDWHILQLWSVPYFVIPLTS 
SWLLESARWLIINNKPEEGLKELRKAAHRSGMKNARDTLTLEIL 
KSTMXKELEAAQKKKPFLGERLHMPNICKRISLLPFTKFANFyA 
YFGL>n^G/LKHLGNNVFLLQTLPGAV/TPPGQLVLHLGHWGSG 
R VS S RGR VNCLGL FVLQ VW 


6777


779 


63 


CFFHGPAWRDCEVRATFAKKQGQSGIISCtAFSPAQPLYACGSY 
GRS LGLY AWDDGS PLALLGGHQGG I THLCFHPDGNR FFS GAR KD 
AELLCWDLRQSGYPLWSLGREVTTNQRIYFDIiDPTGQFLVSGST 
SGAVSVWDTDGPGNDGKPEPVLSFLPQKDCTNGVSLHPSLPLLG 
*«.bPV5VCFIjSPTESGGRRRGAGPSLGSPRRHVHIiECRLQLWWC 
GGGARLQHP* *SPRARKGR 


6778 


311 


805 


IQS I TDESRGS I RRKNPANTRLRLNVP\BBTAGDSE / ERS PEEK 
VQADPRIRSASPKCPTSSPFPKGRSPEGEGET\DPEKVHFHPGP 
iu«u> V ABKW \KGP \S PVbS EG I KDFFSMKPEWENLNQSNVRRMH 
T\AVRLNEVIVKKSRDAKLVLLNMPGPPRKRNGDENY 


6779 


2 


535 


RALRRQPRLLAANG I E PESMA I SEPIKGSRKP CVNKE E LALKKP 
MAKCAWKGPREPPQDARAEAESPGGASESDQDGGHESPPKKKAV 
AW VS AKN PAPMR KKKKVS LGPVS YVLVDS EDGR KKPVM PKKGPG 

SRREASDQKAPRGQQPAEAX7ASTSRGPKAKPEGSPRRATNESRK 
V 


6780 


3 


403 


HEVNDNKPEININLWSPGKEEISYI?EfeDPIDTFVAtVRVQDkt) " 

SGLNGEIVCKLHGHGHFKLQKTYENNYLILTNATLDREKRSEYS 

LTVIAEDRGTPSLSTVKHFTVQINDINDNPPHFQRSRYEFVISE 


6781 


1 


1269 


APTRPVFPTLQDLSSSKEPSNSLNLPHSNELCSSLVHPBLSEVS 
SNVAPSIPPVMSRPVSSSSISTPLPPNQITVFVTSNPITrSANT 
SAALPTHLQSALMSTVVTMPNAGSKVMVSEGQSAAQSNARPQFI 
TPVFINSSS 1 1 Q VMKfSS Q PS XI PAAPLTTNSGLM P PSVAWG PL 
HIPQNIKFSSAPVPPNALSSSPAPNIQTGRPLVLSSRATPVQLP 
SPPCTSSPWPSHPPVQQVKELNPDEASPQVNTSADQNTLPSSQ 
STTMVSPLLTNSPGSSGNRRSPVSSSKGKGKVDKIGOILLTKAC 
KKVTGSLEKGEEQYGADGETEGQGLDTTAPGLMGTEQLSTELDS 
KTPTPPAPTLLKMTSSP VGPGTASAGPS LPGGALPTSVRS I VTT 
LVPSELISAVPTTKSNHGGIASESLAG 


6782 


3 


1327 


KKPTviRIPAKPGKCLHEDPQSPPPLPAEKPIGNTFSTVSGKLS 
NVERTRNLESNHPGQTGGFVRVPPRLPPRPVNGKTIPTQQPPTK 
VPPERPPPPKLSATRRSNKKLPFNRSSSDMDLQKKQSNLATGLS 
KAKSQVFKNQDPVLPPRPKPGHPLYSXYMLSVPHGIANEDIVSQ 
NPGELSCKRGDVLVMLKQTENNYLECQKGEDTGRVHLSQMKLIT 
PLDEHLRSRPNPFSPPKAPSHAQKPVDSGAPHAWLKDFPAEQV 
DDLNLTSGEIVYLLEKIDTDWYRGNCRNQIGIFPANYVKVIIDI 
PEGGNGKRECVSSHCVKGSRCVARFEYIGEQKDELSFSEGEIII 
LKEYVNEEWARGEVRGRTG I FPLNFVBPVEDYPTSGANVISTKV 
PLKTKKEDSGSNSQVNSLPAEWCEALHSFTAETSDDLSFKRGDR 
I 


6783 


3 


1750 


SYHHHHAQQSAAAS PNLTASQKTVTTTSM ITTKTLPLVLKAATA 
TMPAS WGQRPT I AMVTAINS QKAVLS TDVQNTPVNLQTSS KV"r 
GPGAEAVQIVAKNTVTLQVQATPPQPIKVPQFIPPPRLTPRPNF 
LPQVRPKPVAQNNIPIAPAPPPMIAAPQLIQRPVMLTKFTPTTL 
PTSQN S 1 H P VRWNGQTAT I AKTFPMAQLTS I VI ATPGTRLAG P 
QTVQLSKPSLBKQTVKSHTETDEKQTESRTITPPAAPKPKRBBN 
PQKLAFMVSLGLVTHDHLEEIQSKRQERKRRTTANPVYSGAVFE 
PERKKSAVTYLNSTMHPGTRKRGRPPKYNAVLGFGALTP TS PQS 
SHPDSPENEKTETTFTFPAPVQPVSLPSPTSTDGDIHEDFCSVC 
RKSGQLLMCDTCS RVYHLDCLDPPLKT I PKGMWI CP ROQDQMLK 
KEEAIPWPGTLAIVHSYIAYKAAKEEEKQKLLKWSSDLKQEREQ 
LEQKVKQLSNSI S KCMEMKNT I LARQKEMHS SLE KVKQL I RL I H 
G IDLS KP VDS EATVG AI S NG PDCTP PANAATSTPAP S PSSQS CT 
ANCNQGEETK 


6784 


3 


1750 


SYHHHHAQQSAAAS PNLTASQKTVTTTSMITTKTLPLVLKAATA" 
TM PAS WGQR PTI AMVTAINSQKAVLS TDVQNTP VNLOTSS KVT 
GPGAEAVQI VAKNTVTLQ VQATP PQ P I KVPQF I P P PRLTPRPN F 
LPQVRPKPVAQNNIPIAPAPPPMLiAAPQLIQRPVMLTKFTPTTL 
PTSQNS I HPVRWNGQTATI AKTFPMAQLTS I VI ATPGTRLAG P 
QTVQLSKPSLEKQTVKSHTETDEKQTESRTITPPAAPKPKREEN 
PQKLAFMVSLGLVTHDHLEEIQSKRQERXRRTTANPVYSGAVPE 
PERKKS AVTYLNS TMHPGTR KRGR P PK YNAVLGFGALTPTS PQS 
SHPDSPENEKTETTFTFPAPVQPVSLPSPTSTDGDIHEDFCSVC 
RKSGQLLMCDTCSRVYHLDCLDPPLKT I PKGMWI CPRCQDQMLK 
KESAI PWPGTLAI VHS YIAYKAAKEEEKQKLLKWSSDLKQEREQ 
LEQK VKQLSNS I S KCMEMKNTI LARQKEMHS SLE KV KQL I RL IH 
G IDLS KP VDS EATVGAISNGPDCTPPANAATSTP APSPSSQS CT 
ANCNQGEETK


6785


1 


528 


LGNTVLHYCSMYSKPECLKLLLRSKPTVDIVNQAGETALDIAKR 
LKATQCEDLLSQAKSGKFNPHVHVEYEWNLRQEEIDESDDDLDD 
K PS P VKKERS PR PQS FCHSS SIS PQDKLALPGFS TPRDKQRLS Y 
GAFTNQIFVSTSTDSPTSPTTEAPPLPPRNAGKGPTGPPITPHR 


6786 


1820 


1397 


RSPKVLVLAPTRELANHVSRDFKDI \TRKLTVARFYGGTSYQSQ~~ 

I NH I RNG ID I LVGTPGRI KDHLQSGRLDLS KLRHWLDEVDQML 

DLGFAEQVEDI IHES YKTDS EDNPQTLL FS ATCPQWVYTVA\ KK 

YMKSRYEQVDLDGKMTQKAATTVEHLAIQCHWSQRPAVIGDVLQ 

VYSGSBGRAIIFCETKKNVTEMAMNPHIKQNAQCLHGDIAQSQR 

EITLKGFREGSFKVLVATNVAARGLDIPEVDLVIQSSPPQDVES 

Y IHRSGRTGRAGRTG I CI CF YQPRERGQLRYVEQKAGITFKRVG 

VPSTMDLVKSKSMDAIRSLASVSYAAVDFFRPSAQRLIEEKGAV 

DALAAALAHISGASSFEPRSLITSDKGFVTMTLESLEEIQDVSC 

aw xsjLiJW krlio aWA v & y 1 1 km CL LKGNMG VCFD VPTTES ERLQAE 

WHDSDWILSVPAKLPEIEEYYDGNTSSNSRQRSGWSSGRSGRSG 

RSGGRSGGRSGRQSRQGSRSGSRQDGRRRSGNRNRSRSGGHKRS 

FD* VF YHLVDFLSDFLVDSVYLTGRQ I DHLTGLTGL I DHLTSHS 

SVWN 


6787
6788


2646 


2270 


PSSFPKNVPLEELEEPPK*KRSGLGSLTPKSQIQNGP*PQTFFF 
FELGSPSGVISAHCNLRLLGSSDSPAPASRVAGIIGTCHHAWLI 

lvflvbmgfhhvgqaglklltl\vihppwppkvlglqt 




16


936 


ggtvdlr\dmlavsviaavrggr/atvrrvresnvlhekskgkt 
regaedkmtsgdvlsnrkmfyllktafpsvqinteehvd\eldq 
E V I LV?GS * DS * G Y PKGK* LLP KE VPS R / RVLLSGLT PLDATQE \ 
FTEDLS K\ YVTTMVC VAVNGKPMLGV I HKP FSE YTAWAM VDGGS 
NV KARS SYNEKTPRI WS RS HS GMVKQ VALQT FGNQTT I 1 PAGG 
AG YKVLALLD VPDKSQEKADL YIHVT Y I KKWD I CAGNAI LKALG 
GHMTTLS G EE 1 S YTGSDG I EGGLLAS I RMNHQAIjVRKLPDLE KT 
GHK 1 


6789 


2 


678 


^ginvlkiapesaikf^yeqikrlvw**pgds*gf/y^vaH 

GS LAGAI AQS S I YPME VLKTRMALRKTGQ YSGM LD CAR R I LAR E 
GVAAFYKGYVPNMLG 1 1 P YAG I DLAVY ETLKNAWLQHYAVNS AD 
PGVFVLLACGTMSSTCGQLASYPLALVRTRMQAQAS IEGAPEVT 
MSSLFKHILRTEGAFGLYRGLAPNFMKVI PAVS IS YWYENLKI 
TLGVQSR 


6790 
6791 


2 


4068 


appagrrrmqaaprag^aalllwivssclcrawtapstsqkcdH 

EPLVSGLPHVAFSSSSSISGSYSPGYAKINKRGGAGGWSPSDSD 
HYQWLQVDFGNRKQISAIATQGRYSSSDWVTQYRMLYSDTGRNW 
KPYHQDGN I WAF PGNINS DG WRHELQHP I IARYVR I VPLDWNG 

EGRIGLRIEVYGCSYWADVINFDGHVVLPYRFRNKKMKTLKDVI 

ALNFKT S E SEG V I LHGEGQ QGD Y I TLELKKAKL VLS LNLGSNQL 

GPIYGHTSVMTGSLLDDHHWHSWIERQGRSINLTLDRSMQHFR 

TNGEFDYLDLDYEITFGGIPFSGKPSSSSRKNFKGCMESINYNG 

VN I TDLARRKKLE P SNVGNLS FS C VE P YTVP VFFNATS YLEVPG 

RLNQDLFSVSFQFRTWNPNGLLVFSMFADNLGNVEIDLTESKVG 

VHINITQTKMSQIDISSGSGLNDGQWHEVRFLAKENFAILTIDG 

DEASAVRTNSPLQVKTGEKYFFGGFLNQMNNSSHSVLQPSFQGC 

MQLIQVDDQLVNLYEVAQRKPGSFAMVSIDMCA1IDRCVPNHCE 

HGGKCS QTWDS FKCTCDETGYSGATCHNS I YEPSCEAYKHLGQT 

SNYYWIDPDGSGPLGPLKVYCNMTEDKVWTIVSHDLQMQTPWG 

YNPEKYS VTQLVYSASMDQI SAI TDSAE YCEQYVS YFCKMS RLL 

NTPIX5SPYTWWVGKANEKHYYWGGSGPGIQKCACG1ERNCTDPK 

YYCNCDADYKQWRKDAGFLSYKDHLPVSQVWGDTDRQGSEAKL 

SVG PLRCQGDRN YWNAAS FPNPS S YLHFS TFQGETSADI S FYFK 

TLTPWGVFLENMGKEDFIKLELKSATE VS FS FDVGNGPVEI WR 

SPTPLNDDQWHRVTAERNVKQASLQVDRLPQQIRKAPTEGHTRL 

ELYSQLFVGGAGGQQGFLGCZRSLRMNGVTLDLEERAKVTSGFI 

SGCSGHCTSYGTNCENGGKCLERYHGYSCDCSNTAYDGTFCNKD 

VGA FFEEGMWLRYNFQAPATNARDSSSRVDNAPDQQNS H PDLAQ 

EEIRFSFSTTKAPCILLYISSFTTDFLAVLVKPTGSLQIRYNLG 

GTREWYtf I D VDHRNMANGQPHS VNI TRHEKTI FLKLDHYPS VS Y 

HLPS SS DTL FNS P KS LFLGKVI ETGKIDQE I HKYNTPG FTGCLS 

RVQFNQIAPLKAALRQTNASAHVHIQGELVESNCGASPLTLSPM 

SSATDPWHLDHLDSASADFPYNPGQGQAIRNGVNRNSAriGGVI 

A\W1FTPSLCTP\VLP*SR*HVSPHKGTLPIPNEAKGAGSRQK 

KPGRRPSMNNDPPTSQRP IDESKKEWPHLRGG YIAKG | 


6792


1801 


1193 


TGHSGAKGEKGDKGDLGPRGERGQHGPKGBKGYPGIPPErTPGlH 
SAW* SWLTAASTKVQAILL PQP LE * LGLQI AFMAS LATHFSNQ 
NSGI I FSS VETNIGNFFDVMTGRFGAPVSGVYFFTFSMM KHEDV 
EEVYVYLMHNGNTVFSMYSYEMKGKSDTSSNHAVLKLAXGDEVW 
LRMGNGALHGDHQR FS TFAGFLL FBTK 




33 


1073 


VRHTNWGVDMVLPSLGSESPKGAIGHlVSTEKTILAVERiJKVlsH 
P PL WNRTFS WG FDDF S CCLGS YGSDKVLMTFENLAA WGR CLCAV 

CPSPTTIVTSGTSTWCVWELSMTKGRPRGLRLRC2ALYGHTQAV 
TCLAASVTFSLLVSGSQDCTCILWDLDHLTHVTRLPAHREGISA 
ITISDVSGTIVSCAGAHLSLWNVNGQPLASITTAWGPEGAITCC 
CLMEGPAWDTSQIIITGSQDGMVRVWKT/VGCEDVCSWTASRRG 
APGSASKPKRPQVGEEPGLESRAGR*HCFDREAQQNQP\PVTAL 
AVSRNHTKLLVGDERGR I FCWSADG* EERGSRGSGTTVPG 


6793 


2340 


805 


GRKEANY\YGSLTQAGTVSLGLDAEGQEVFVPFSAVLPMVAPND 
LVFDGVroiSSLNLAEAMRRAKVLDWGLQEQLWPHMEALRPRPSV 
YIPEFIAANQSARADNLIPGSRAQQLEQIRRDIRDFRSSAGLDX 
VIVLWTANTERFCEVIPGLNDTAENLLRTIBLGLEVSPSTLFAV 
AS I LEGCAFLNGSPQNTLVPGALELAWQHRVFVGGDDFKSGQTK 
VKS VLVDFL I GS GLKTMS I VSYNHLGNNDGENLS APLQ FRS KEV 
SK5NWDDMVQSNPVLYTPGEEPDHCWIKYVPYVGDSKRALDE 
YTS ELMLGGTNTLVLHNTCE DS LLAAP IMLDLALLTEL CQR VS F 
CTDMDPEPQTFHPVLSLJUSFLFKAPLVPPGSPVVNALFRQRSCI 
EN I LRACVGLPPQNHMLLEH KM E R P GPSLKRVGP VAATY PMLNK 
KGPVPAATNGCTGDANGHLQEEPPMPTT*GPGHTVSRLFLPAAP 
HDPTLKAPTNKGRCHFSPPSTWGSWGL 


6794


169 


1349 


DDVKRKPEASAH * EKPGPPSRPGVRGGRERAGGRGSHGARSCR\ 
EPAP PAPAPPEDHPDEEMGFTID I KS FLKPGEKTYTQRCRLFVG 
NLPTDITEEDFKRLFERYGEPSEVFINRDRGFGFIRLESRTLAE 
I AKAELDGTI LKSRPLR IRFATHGAALTVKNLS P WSNELLEQA 
FSQFGP VEKAVVVVDDRGRATGKGFVEFAAKPPAR KALE RCGOG 
AFLLTTTPRPVIVEPMEQFDDEDGLPEKLMQKTQQYHKEREQPP 
RFAQ PGTFE FE YAS RWKALDEMEKQQREQVDRNI REAKE KLEAE 
ME AARHEHQLM LMRQDLMRRQEELRRLEELRNQE LQKRKQ X QLR 
HEEEHRRREEEMIRHREQEELRRQQEGFKPNYMENYVCHFLR 


6795


1740 


1010 


GPRRQTQVRDIIELDSF*DWAAQETDCAQNSGERL*KGV/liENFS 
TMSKSAVKISLDLLSNPLCEQDQDLLNMVTALDTAMKRMDAFNQ 
EKVNQI QKTVIE PL KKFGS VF PS LNMAVKRREQALQDYRR LQ AK 
VEKYEEKEKTGPVLAKLHQAREELRPVREDFEAKNRQLLEEMPR 
FYGSRLDY FQPS FE S L I RAQ WYYS EMHKIFGDLS HQLDQPGHfl 
DEQRERENEAKLSELRALS IVADD 


6796 


48 


683 


GKEIQI PTI KLAWLLFGLE * PVGALGKGWS P+ * SHVALGQLGW 
LTRAVRSSWRWELCVSAQEWSQRSA*SSPSPVGACPSLNPPET 
S VQEGRDCWQR* LPRLFSALVGQPGCWPQGAPPERCV* PGRCKW 
HLQSQVLR* ERRRCCRCLPR FA * GWRRRHQRLGLG I HPAPLGST 
S PPHP EGNS QQ CR R * GWAAELRLPS S WL *GKLG C * 


6797 


1620 


211 


termtpsqptrgssctrfssmlwtstWrcltchwagmrmswgv 
tlgpmaqgllsasgttteatwtrptthltlirwwlltasrvdpp 
erpppppsddltllessssyknl/daqipq/dwsmspstsg+rp 
ltsrass imrsrtai psas *srlttkhtvggspsawrprptsrs 
vstpvssstettasgscltwwssspapcpsssapahsfeascck 
tslwgs cggsgdgssacgsgwnlsmagtscsspamcspsraps * 
rsasrprtwrattsaasswaprrcwcgwa*sat*psstttisss 
phcgwpcpascas aaawlsstwatas vagscwgp im * ss ahs pw 
clsacsrssmgttcl+rspp\sgasraaaawcgsspsstftpss 

ASSSTWCSASSSRSSPAPTTPSSIPAAQAQRRASCRPTSHSART 
APPPASSAAGAARPAAFSAAAEGTPRRS IRCW 


6798 


3894 


1696 


STISWESLESWLNKATNPSNRQEDWEYIIGFCDQINKELEG*VS
ALMGQLRGSGLGRGTTMAKEGQPGSPRLSALECVLIiVPQ\PQIA 
VRLLAHKIQSPQEWEALQALTYLGDRVSEKVKTKV I ELLYSWTM 
ALPEEAKIKDAYHMLKRQGIVQSDPPIPVDRTLIPSPPPRPKNP 
VFDDEEKSKLLAKLLKSKNPDDLQEANKLIKSMVREDEARIQKV 
TKRLHTLE EVNNNVR LLSEMLLH YS QEDS S DGDRE LMKELFDQC 
ENKRRTLFKLASETEDNDNSLGDILQASDNLSRVINSYKTIIEG 
OVINGEVATLTLPDSEGNSQCSNQGTLIDLAELDTTNSLSSVLA 
PAPTPPSSGI PILPPP PQASGPPRSRSSSQAEATLGPSSTSNAL 
S WLDEELLCLGLAOP APNVPP KES AGNSQHHLLQREQS DLDFFS 
PRPGTAACGAS DAPLLQPS APSS SSSQAPLPPPFPAP WPAS VP 
APSAGSSLFSTGVAPAIAPKVEPAVPGHHGLALGNSALHHLDAL 
D QLLEE AKVTS GLVKPTTS PLI PTTTPAR P LL P FSTG PGS PLFQ 
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Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








PLS FQS QGS PPKGPELSLAS IHVPLES I KPSS ALPVTAYDKNGF 
RILFHFAKECPPGRPDVLVWVSMLNTAPLPVKSIVLQAAVPKS 
MKVKLQPPSGTELSPFSPIQPPAAITQVMLLANPbKEKVRLRYK 
LTFALGEQLSTEVGEVDQFPPVEQWGNL 


6799


3894 


1696 


STI SWBS LESWLNKATNPSNRQEDWE YI IGFCDQINKELEG * V3 
ALWGQLRGSGLGRGTTMAKEGQPGSPRLSALECVLLVPQ\PQIA 
VRLLAHKIQS PQEWE ALQALTYLGDRVS EKVKTKV I E LL YS WTM 
ALPEEAKIKDAYHMLKRQGIVQSDPPIPVDRTL1PSPPPRPKNP 
VFDDEEKSKLLAKLLKSKNPDDLQEANKLIKSMVREDEARIQKV 
TKRLHTLEEVNNNVRLLSEMLLHYSQEDSSDGDRELMKELFDQC 
ENKRRTLFKLASETEDNDNSLGDILQASDNLSRVINSYKTIIEG 
OVINGEVATLTLPDSEGNSQCSNQGTLIDLAELDTTNSI/SSVLA 
PAPTPPSSGIPILPPPPQASGPPRSRSSSQAEATLGPSSTSNAL 
S WLDE ELLCLGLAD P APNVP PKES AGNSQWHLLQR EQSDLDF FS 
PRPGTAACGASDAPLLQPSAPSSSSSQAPLPPPFPAPWPASVP 
APS AGSS LFSTGVAPALAP K VE PAVPGHHGLALGNS A LHH LDAL 
DQLLEEAKVTSGLVKPTTSPLIPTTTPARPLLPFSTGPGSPLFQ 
PLSFQSQGS PPKGPELSLAS IHVPLES I KPSSALPVTAYDKNGF 
R I LFHFAKECPPGRPDVLWWSMLNTAPLP VKS I VLQAAVPKS 
MKVKLQPPSGTELSPFSPIQPPAAITQVMLLANPLKEKVRLRYK 
LTFALGEQLSTEVGEVDQFPPVEQWGNL 


6800 


404 


1646 


RRSPSTGLSPVPQ.PSSPSLSDYSIPWSLLLSGTIAWATPGK*AG 
* PQAW* LGLAPAIAFI /GLTRGRKQNKEKMAEGGSGDVDDAGDC 
S GAR YNDWSDDDDDS NES KS I VW YPPWAR IGTEAGTRARARARA 
RATRARRAVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILE 
AALI ALGNNAAYAFNRDI IRDLGGLPI VAKI LNTRDP I VKEKAL 
IVIiNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLAGLRLL 
TNMTVTNEYQHMLANS ISDFFRLFSAGNEETKLQVLKLLLNLAE 
NPAMTRELLRAQVPSSLG\SLFNKKENKEVILKLLVIFENINDN 
FKWEENEPTQNOFGEGSLFFFLKEFQVCADKVLGIESHHDFIjVK 
VKVGKFMAKLAEHMFPKSQE 


6801


2 


1755 


SAEEFESQQASVTMHDVDAESFEVLVDYCYtGRVSLSEANVERL 
YAASDMLQLEYVREACASFLARRLDLTNCTAILKFADAFGHRKL 
RSQAQS Y I AQNFKQLSHMGS I REETLADLTIiAQLLAVLRLDSLD 
VESEG^TVCHVAVQWLEAAPKERGPSAAEVFKCVRWMHFTEEDQD 
YLEGLLTKPIVKKYCLDVIEGALQMRYGDLLYKSLVPVPNSSSS 
/R * QQQLS C I CSRKSTPETGYVCQGDGDLLWTPQRSLS \RYDP Y 
S GD I YTM PS PLTSFAHTKTVTS S AVCVS PDHD I YLAAQPRKDLW 
VYJCPAQNS WQQLADRLLCREGMDVA YLWG YI YILGGRDPITG VK 
LKEVECYS VQRNQWALVAPVPHS FYSFEL I WQNYLYAVNSKRM 
LCYDPSHNMWLNCASLKRSDFQEACVFNDEIYCICDIPVMKVYN 
PARGEWRR I SNI PLDS ETHNYQ I VNHDQKLLL ITSTT PQ W KKNR 
VTVYE YDTREDQW I Nl GTMIX3LLQ FDSGFI CLCARVYP S CLE PG 

QSFITEEDDARSESSTEWDLDGFSELDSESGSSSSFSDDEVWVQ 
VAPQRNAQDQQGSL 


6802 


157 


1*41 


ETFPLFFFLLSKTPGKTASMAHFVQGTSRMIAAESSTEHKECA^" 

PSTRKNLMNSLEQKIRCLEKQRKELLEVNQQWDQQFRSMKELYE 

RKVAELKTKLDAAERFLSTREKDPHQRQRKDDRQREDDRQRDLT 

RDRLQREEKEKERLNEElJreLKEENIujLKGKNTLANKEKEHYEC 

E I KRLN KALQDALN I KCS FS E DCLRKS RVBFCHEEMRTEM E VLK 

QQVQIYEEDFKKERSDRERLNQEKEELQQ2NETSQSQLNRLNSQ 

IKACQMEKEKLEKQLKQMYCPPCNCGLVFHLQDPWVPTGPGAVQ 

KQREHPPD YQW YALD QL P PD VQHKAN/ DWCLAP PPVCCQAG/ PR 

TP GLK* SS CLWLPKC * N FR FI LS KE S PS VE VHTNRERQQATRER 

G 


6803


1 


2203 


KLSGRPYRHMGVLGTSKLYDIRKTI FTFTPQFIDQQQFYLALDN
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corresponding 
to first 
amino acid 
residue of 
amino acid 
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Predicted end 
nucleotide 
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corresponding 
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residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








KMIVEMLRTDLSYLCSRWRMTGQPTITFPISHSMLDEDGTSLNS 
S I LAALRKMQDG YPGGAHVQTGKLSEFLTTSCCTHLS FMDPGPE 
GKLYSEDYDDNYDYLESGNWMNDYDSTSHARCX5DEVARYLDHLL 
AHTAPH P KLAPTS QKGGLDRFQAAVQTTCDLMSLVTKAKELHVQ 
NVHM YLPTKL FQAS R PS FNLLDS PHPRQENQ VPS VR VE IHLPRD 
OS GEVD FKALVLOLKETSSLQE OAD I L YMLYTMKG PDWNTELYN 
ERS ATVRELLTELYGKVGE IRHWGLIRYI SG ILRKKVEALDEAC 
TDLLSH0KHIiTVGIJPPEPREKTISAPLPYEALTQLIDBASEGD^5 
SISILTQEIMVYLAMYMRTQPGIjPAEMFRLRIGLIIQVMATELA 
HSLRCSAE E ATEG LMNLS PSAMKNLLHH I LS GKEFGVE RK/ S VR 
PTDSNVSPAISIHEIGAVGATKTERTGIMQLKSEIKQVEFRRLS 
ISAESQSPGTSMTPSSGSFPSAYDQQSSKDSRQGQWQRRRRLDG 
ALNRVP VGFYQKVWKVLQKCHGLS VEGFVLPSSTTREMTPGE I K 
FS VHVE S \ VLNVLLRPE YRQLLVEAI LVLTMLAD I E IHS IGS 1 1 
AVEKI VHIANDLFLQEQKTLGP \DDTMLAKDPASG\ I CTLR\ YD 
SAPSGRFGTMTYLS\RAA\ATYVQEFLP\HS icamq 


6804 


1 


951 


GSPGKKEEKAKNKESLCMENSSNSSSDEDEEETKAKMTPTKKYN 
GLE EKRKSLRTTG FYSG FS E VAEKR I KLLNNSDERLQN S RAKDR 
KDVWSSIQGQWPKKTLKELFSDSDTEAAASPPHPAPEEGVAEES 
LQTVAEEESC3PSVELEKPPPVNVDSKPIEEKTVEVNDRKAEFP 
SSGSNFSA*IPLPYLHLNRLHQSL*QKGSRQQSSVTVSEPLAPN 
QBE VRS IKSETDSTIEVDS VAGELQDLQSERE * LASRF * CQCEL 
KQ * * SARTRTS * KSLYRSEKSERCSGRRKFI KKAEKKP * SNSGK 
QQKEGKRHK 


6805 


1539 


206 


« \i*ruu R. x tf\j iwb r V V b V a IS b b o UUS W u U f KFAOBXKARNRNQNYXi 
VPSPVLRILDHTAFSTEK6ADIVICDEECDSPESVNQQTQEESP 
I E VHTAEDVP I AVE VHAI S ED YDI ETENNSS E S LQDQTDE E P PA 
KLCKILDKSQALNVTAQQKWPLLRANSSGLYKCEIiCEFNSKYFS 
DLKQHMILKHKRTDSNVCRVCKESFSTNMLLIEHAKLHEEDPYI 
CKYCDYKTVIFENLSQHIADTHFSDHLYWCEQCDVQPSSSSELY 
LHFQEHSCDEQYLCQFCEHETNDPEDLHSHVVNEHACKLIELSD 
KYNNGEHGQYSLLSKITFDKCKNFFVCQVCGFRSRLHTNVNRHV 
AIEHTKIFPHVCDDCGKGFSSMLE\IAKHLNSHLSEGIYLCQYW 
EYSTGQIEDLKIHLDFKHSADLPHKCSDCLMRFGNERBLISHLP 
VHETT 


6806 


272 


3794 


VALCFPNSDPVMFMDAFYGCLIiAELGPVPIEVPL.TRKDAGSQQV^' 
GFhLGS CGVFIALTTDACQKGLPKAQTGE VAAFKI3 WPPL S WLVI 
DGKHLAKPPKDWHPIAQDTGTGTAYIEYKTSKEGSTVGVTVSHA 
SLLAQCRALTQACGYSEAETLTNVLDFKRDAGLWHGVLTSVMNR 
MHWS VPYALMKANPLS W I Q KVCFY KARAAL V KS RDMHWS LLAQ 
RGQRDVSLSSLRMLXVADGANPWS I SSCDAFLNVFQSRGLRPEV 
ICPCASSPEALTVAIRRPPDLGGPPPRKAVLSMNGLSYGVIRVD 
TBEKLSVLTVQDVGQVMPGANVCWKLEGTPYLCKTDEVGEICV 
SSSATGTAYYGLLGITKNVFEAVPVTTGGAPIFDRPFTRTGLLG 
F IG PDHLVF I VGKLDGLM VTGVRRH NADD WATALAVE PM K FVY 
RGR I AVFSVTVLHDDR I VLVAEQRPDASEEDS FQWMSRVLQAID 
SIHQVGVYCLALVPANTLPKAPLGGIHISETKQRFLEGTLHPCN 
VLMCPHTCVTNLPKPRQKQPEVGPASMIVGNLVAGKRIAQA5GR 
E1JUILEDSDQARKFLFLADVLQWRAHTTPDHPLFLLLNAKGTVT 
STATCVQLH KRAERVAAALMEKGRL S VGDHVALVYPPGVDL I AA 
FYGCLYCGCVPVTVRPPHPQNIXSTTLPTVN^IVETVSKSACVLTT 
QAVTRLLRSKEAAAA VDIRTWPT I LDTDDI P KKK I AS VFRPPSP 
DVLAYLDFSVSTTG I LAG VKMSHAATSALCRS I KLQCELYPSRQ 
IAICLDPYCGLGFALWCLCSVYSGHQSVLVPPLELESNVSLWLS 
AVS Q YKARVTFCCYS VMEMCTKGLGAQTGVLRMKG VNLS CVRTC 
MWAEERP\RIALTQSFSKLFKDLGLPARAVSTTFGCRVNVAIC 
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Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








LQGTAGPDPTTVYVDMRALRHDRVRLVERGSPHSLPLMESGKIL 
PGVKVIIAHTETKGPLGDSHLGEIWVSSPHNATGYYTVYGEEAL 
HADH FS ARLS FGDTQT I WARTG YLG FLRRTELTDASGGRHDAL Y 
WGSLDETLELRGMRYHPIDIETS VIRAHRS IAECAVFTWTNLL 
VVVVELDGLEQDALDLVALVTNVVLEEHYLWGVVVI VDPGVI P 
INS RGEKQRMHLRDGFLADQLDP I YVAYNM 


6807 


1444 


606 


VGHDTVHAMFTCFPKCLGFSPPVNVTVSPRSEESHTTTVSGGNG 
S VFQAGPQLQALANLEARRGS IGAALSSRDVSGLPVYAQSGE PR
RLTQAQVAAFPGENALEHSSDQDTWDSLRSPGFCSPLSSGGGAE 
SLPPGGPGHAEAGHLGKVCDFHLNHQQPSPTSVLPTEVAAPPLE 
KILSVDSVAVDCAYRTVPKPGPQPGPHGSLLTEGCLRSLSGDLN 
RFPCGMEVHSGQRELESWAVGEAMA\LKFPMGAMSYCLRDRSR 
FLFRLPMGLSCPLQVQ 


6808


2063 



737 


GVGSGAASALARSRPLASRLSSRRRTRAPRSGAMQRLAMDLRML 
SRELS LYLEHQVRVGF FGSGVGLSL ILGFS VAYAFYYLSS IAKK 
PQLVTGGESFSRFLQDHCPWTETYYPTVWCWEGRGQTLLRPF\ 
I TS KP P VQYRNEL I KTADGGQ ISLDWFDNDNSTCYMDAS TRPT I 
LLLPGLTGTSKESY1LHMIHLSEELGYRCWFNNRGVAGENLLT 
PRTYCCANTEDLETVIIlHVHSLYPSAPFIiAAGVSMGGMLLLNYL 
GKIGSKTPLMAAATFSVGWNTFACSESLEKPLNWLLFNYYLTTC 
LQSS VNKHRKMFVKQVDMDHVMKAKS IREFDKRFTSVMFGYQTI 
DD YYTDASPS PRLKSVG IPVLCLNS VDDVFS PSHA1 P I ETAKQN 
PNVALVLTS YGGH IG FLEG I WPRQSTYM DRVFKQ FVQAM VEHGH 
ELS 


6809


939 


65 


DYSGQTPVPTEHGMTLYTPAQTHPEQPGSEASTQPIAGTQTVPQ 
TDEAAQTDSQP LHPSDPTE KQQ PKRLHVSNI PFR FRDPDLRQM F 
GQFGKIIiDVEI I FNERGS KG FGFVTFETSSDADRAREKIiNGTI V 
EGRKIEVNNATAR VMTNKKTGNP YTNG WKLNP VVGAVYG PE FYA 
VTGFPYPTTGTAVAYRGAHLRGRGRAVYNTFRAAPPPPPI PTYG 
AWYQDGFYGAE I \ LEATQ PTDTLS PLQRRQ PTATVTAES TQLP 
TRT I TPSG PRRP TALE PCETFHRFLLGP 


6810 


939 


65 


DYSGQTPVPTEHGMTLYTPAQTHPEQPGSEASTQPIAGTQTVPQ 
TDEAAQTDSQPLHPSDPTEKQQPKRLHVSNIPFRFRDPDLRQMF 
GQFGKILDVEI I FNERGS KGFGFVTFETSSDADRAREKLNGTI V 
EGRKI E VNNATAR VMTNKKTGNP YTNG WKLNP VVGAVYGPE FYA 
VTG FP YPTTGTAVAYRGAHLRGRGRAVYNTFRAAPPPPP IPTYG 
AWYQDG FYGAE I \ LEATQ PTDTLS PLQRRQPTATVTAES TQLP 
TRTI TP SGPRRPTALE PCETFHRFLLGP 


6811 


1*22 


658 


DLVTVWSFVDCRVIASTHGH\KSWVSWAFDPYTTSVEEGDPME 
FSGSDEDFQDLLHFGRDRADSTQCRLSRRNSTDSRPVSVTYRFG 
S VGQDTQLCLWD LTED I L FPHQPLSRARTHTNVMNATS PPAGSN 
GNSVTTPGNSVPPPLPRSNSLPHSAVSNAGSKSSVMDGAIASGV 
SKFATLSLHDRKERHHEKDHKRNHSMGHISSKSSDKLNLVTKTK 
TDPAKTLGTPLCPRMBDVPLLEPLICKKIAHERLTVLIFLEDCI 
VTACQEG FICTWGRPGKWS FNP 


6812 


4001 


1682 



EDAVFSLDLSTI IQGTWFLNGEELKSNEPEG£ VE PGALR YRIEQ 
K(jLUHKLILIiAVKHQDSGALVGFSCPGVQDSAALTlQESPVHIL 
SPQDKVSLTFTTSERWLTCELSRVDFPATWYKDGQKVEESELL 
WKMDGRKHRL1LPEAKVQDSGEFECRTEGVSAFFGVXVQDPPV 
HIVDPREHVFVHAITSECVMLACEV\DR\EDAPVRWYKDGQEVE 
ESDFVVLENEGPHRRLVLPATQPSDGGEFQCVAGDECAYFTVTI 
TDVSS W I VY PSGKVYVAAVRLERVVLTCELCRPV7AEVRWTKDGE 
EWESPALLLQKEDTVRRLVLPAVQLEDSGEYLCE I DDESAS FT 
VTVTBPPVR I IYPRDEVTLI AVTLECWLMCELSREDAPVRWYK 
DGLEVEESEALVLERDGPRCRLVIjPAAQPEDGGEFVCDAGDDSA 
FFTVTVTEPPVQFLALETTPSPLCVAPGEPVVLSCELSRAGAPV 






WO 01/53312 



PCT/US00/34263



SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








WSHNGRPVQEGEGLELHAEGPRRVLCIQAAGPAHAGLYTCQSG 
AAPGAPS LS FTVQVAEPP VRWAP E AAQTRVRS TPGGDLE L WH 
LSGPGGPVRWYKDGERLASQGRVQLEQAGARQVLRVQGARSGjDA 
GEYLCDAPQDSRIFLVSVEEPLLVKLVSDLTPLTVHEGDDATFR 
CEVS P PDADVTWLRNGAWTPGPQRQS CCS YGGCRMCGQRKART 
CVSKWRQAEWVQRGPCAGCEVGSPCPTTLACPWPRMGTSTASSS 
MVSYWPTRAPTAARATTIAPWPGSA 


6813 


9 


836 


S S TQQRPG VPAGPRPLDG YLGVADH K PLKMHCRD CAL VTS SGHL 
LHSRQGSQIDQTECVIRMNDAPTRGYGRDVGNRTSLRVIAHSSI 
QRILRNRHDLLNVSCX3TVFIFWGP5SYI^RDGKX3QVY2^HLIiS 
QVLPRLXAFM I TRHKM LQ FDELFKQETGQ\NRKI SNTWLS TGWP 
TMTIALELCDRINVYGMGPPDFCRDPNHPSVPYHYYEPFGPDEC 
TMYLSHERGRKGSHHRFITEKRVFKNWARTFNIHFFQPDWKPBS 
IiAINHPENKPVF 


6814


3 


737 


KFRRQEAN/ARERNRMHGliNDALDNLRKWPCYS KTQKLSKIET 
LRLAKMY I WALS E ILRIGKRPDLLTFVQNLCKGLSQPTTNLVAG 
CLQLNARS FLMGQGGEAAHHTRSP YSTFYPPYHS PELTTPPGHG 
TLDNSKSMKPYNYCSAYESFYESTSPECASPQFEGPLSPPPINY 
NG I FSLKQEETLDYGKNYN YGMHYCAVPPRGPLGQGAMFRLPTD 
SHFPYDLHLRSQSLTMQDELNAVFHN 


6815




553 


QGLDPASQTKVVELLKDGSGRRGDRRSSRDMAGGAGPRSESDLE 
DVGPTAEWNGDGSGSLRRSGSFGKLRDALRRSSEMLVKKLQGGT 
PQEPPNPRMKRASSLNFLNKSVEEPTQPGG 


6816 


1 


803 


NLLKTHKF \LLGQDEDSLHS VPVAQtaGNYQE YLKTLAS PLREl D 
PDQPKRLHTFGNPFKQDKKGMMIDEADEFVAGPQNKVKRPGEPN 
SPMSSKRRRSMSLLLRKPQTPPTVTNHVGGKGPPSASWFPSYPN 
LI KPTLVHTDATI IHDGHEEKMENGQ ITPDGFLS KSAPSE LINM 
TGDLMPPNQVDSLSDDFTSLS KDGL I QKPGSNAFVGGAKNCS LS 
VDDQKDPVASTLGAMPNTLQITPAMAQGINADI KHQLMKEVRKF 
GRSK 


6817 


172 


34*7 


LGMMDSPKIGNGLPVIGPGTDIGISSLHMVGYLGKNFDSAKVPS 
DEYCPACKEKGKLKALKTYRISFQES I FLCEDLQCI YPLGSKSL 
NNL I S PDLEECHTPHKPQKRKSLESS YKDSLLLANSKKTRNYI A 
I DGGKVLNS KHNGE VYDETS SNL P DS SGQQNP I RTADS L ERNE I 
LEADTVDMATTKDPATVDVSGTGRPSPQNEGCTSKLEMPIjESKC 
TSFPQALCVQWKNAYALCWIiDCILSALVHSEELKNTVTGLCSKE 
ES IFWRLLTKYNQANTLLYTSQLSGVKDGDCKKLTSEI FAE IET 
CLNEVRDEI FISLQPQLRCTLGDMES P VFAFPLLLKLETHI EKL 
FLYS FS WDFECSQCGHQYQNRHMKSLVTPTNVI PEWHPLNAAH F 
GPCNNCNS KSQ I R KM VLEKVS P I FMLH F VEGLP QNDLQHYAFHF 
EGCLYQ I TS VIQ YRANNHF I TW I LDADGS WLE CDDLKGPCS ERH 
KKFEVPASEIHIVIWERKISQVTDKEAACLPLKKTNDQHALSNE 
KPVSLTSCS VGDAASAETAS VTHPKD I S VAPRTLSQDTAVTHGD 
HLLSGPKGLVDNILPLTLEETIQKTASVSQLNSEAFL\LENKPV 
AENTGILKTNTLLSQESLMASSVSAPCNEKLIQDQFVDISFPSQ 
WNTNMQSVQLNTEDTVNTKSVNNTDATGLIQGVKSVEIEKDAQ 
LKQFLTPKTEQLKPERVTSQVSNLKKKETTADSQTTTSKSLQNQ 
SLKENQKKPFVGSWVKGLISRGASFMPLCVSAHNRNTITDLQPS 
VKGVNNFGGFKTKGINQKASKVSKKARKSASKPPPISKPPAGPP 
S SNGTAAHPHAHAAS EVLEK SGST S CG AQLNHS S YGNG I S S ANH 
EDLVEGQIHKLRLKLRKKLKAEKKKLAALMSSPQSRTVRSENLE 
QVPQDGS PNDCES IEDLLNELPYP IDIANESACTTVPGVSLYSS 
QTHE B I LAELLS PTP VSTELS ENGEGDFR YLG MGDSH I P PPVPS 
EFNDVSQNTHLRQDHNYCS PTKKNPCEVQPDS LTNNACVRTLNL 
ESPMKTDIFDEFFSSSALNAliANDTLDLPHFDEYLFENY 


6818 


2 


240 


RGFDKVLWT/LSGAVK\CVQFSRISPDGEEGYPGELKVWVTYTL 
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Amino acid segment containing signal pepticfe - "" 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








DGGE/LHS/ATTEHKP/VQATPVNLT\TILTSTWQARLPQI


6819 


1 


961 


GIPCTEMGNFDNANVTGEIEFAIHYCFKTHSLEICIKACKNLAY 
GEEKKKKCNPYVKTYLLPDRSSQGKRKTGVQRNTVDPTFQETLK 
YQVAPAQL VTRQLQ VS VWHLGTLARRVFLGEVT I PLATWDFED S 
TTQSFRWHPLRAKADKYEDSVPQSNGELTVRAKLVLPSRPRKLQ 
EAQEGTDQPSLHGQLCLVVLGAKNLPVRP0GTLNSFVKGCLTLP 
DQQKLRLKS PVLRKQACPQWKHS FVFSGVTPAQLRQSSLELTVW 
DQALFGMNDRLLGGT\RLGSKGDTAVGGDACSQSKLQWQKVLSS 
PNLWTDMTLVLH 


6820 


1014 


340 


GDMVYIVGHVPPGFFEKTQNKAWFREGFNEKYLKVVRKHHRVIA 
GQ FFGHHHTDS FRML YDDAGVP I SAMFITPGVTPWKTTLPGWN 
GANNP A IR VFE YDRATLSUCDMVTYFMNLS Q ANAQGTPRWELE Y 
QLTEAYG VPDASAHS MHTVLDR I AGDQS TLQRY YVYNS VS YS AG 
VCDEACS MQHVCAMRQVD I DAYTTCL YASGTTP VPQLPLLLMAL 
LGLCT 


6821 


1088 


518 


EFDIYR/EVGGEFVPVTRDDSSNGFPRTQHGPSPTVriP^QSPQN 
RFCVLTLDPETLPAIATTLIDVLFYSHSTPKEAASSSPEPSSIT 
FFAF S LI EG Y I \ S I VMD AETQKKFPSDLLLTS S SGE LWRM VR IG 
GQPLGFDECGIVAQIAGPLAAADISAYYISTFNFDHALVPEDGI 
GSVIEVLQRRQEGLAS 


6822 


1088 


518 


EFDIYR/EVGGEFVPVTRDDSSNGFPRTQHGPSPTVHPIQSPQN 
RFCVLTLDPETLPA I ATTL I DVL FYS HSTPKEAASSS P E PS S I T 
FFAFS LIEG Y I \ S I VMDAETQKKFPS DLLLTS SSGE LWRM VR I G 
GQPLGFDECGIVAQIAGPLAAADISAYYISTFNFDHALVPEDGI 
GSVIEVLQRRQEGLAS 


6823 


654


221 


PPKLLSRWARMGHGDBIV\LSDLNFPGLLHLPWGPWRSVQTAC
GIPQLLEAVLKLLPLDTYVESPAAVMELVPSDKERGLQTPVWTE 
YES ILRRAGCVRAIiAKI ERFEFYERAKKAFAWATGETAL YGNL 
ILRKGVLALNPLL 


6824 


858


104 


LLLAQR WGWG \ CC FFSLA VS VKMNVLLFAPGLLFLLLTQ FGFRG
ALPKLGICAGLQWLGLPFLLENPSGYLSRSFDLGRQFLFHWTV 
NWRFLPEALFLHRAFHLALLTAHLTLLLLFALCRWHRTGES I LS 
LLRDPSKRKVPPQPLTPNQIVSTLFTSNFIGICFSRSLHYQFYV 
WYFHTLPYLLWAMPARWLTHLLRLLVLGLIELSWNTYPSTSCSS 
AALHI CHAVILLQLWLGPQPFPKSTQHSKKAH 


6825 


3 


1173 


SSGEFGLQASDIMWTISDTGWILI ILCSLMEPWALGACTFVHLL
PKFDPLVILKTLSSYPIKSMMGAP1VYRMLLQQDLSSYKFPHLQ 
NCLAGGES LLP ET LENWRAQTGLD I RE FYGQTETGLTCMVS KTM 
KI KPG YMGTAAS CYDVQI IDDKGNVL PPGTEGD IG IR VKP IRP I 
GIFSGYVDNPDKTAANIRGDFWLLGDRGIKDEDGYFQFMGRADD 
IINSSGYRIGPSEVENALMEHPAWETAVISSPDPVRGEWKAF 
VILALQFLSHD P EQLTKBLQQHVKS VTAPYKYPR K I EFVLNLPK 
TVTGKIQRA\KLRDKEWKMSGKAPCAVRHLRDIHLDSPLLSLSF 
PFGPLALPMDGYGDSLWEEHEYKFCLALVISTKLYHVRC 


6826 


2304 


954 


LKTESFKPW/VNIALAFHLLGERASPNSFWQPYIQTLPREYDTP 
L YFEEDE VRYLQS TQ AIHDVFSQYKNTARQ YAYF YKVI QTH PHA 
NKLPL KDS FTYBD YRWAVSS VMTRQNQ I PTEDGS RVTLAL I PLW 
DMCNHTNGL ITTG YNLEDDR CEC VALQD FRAG EQ I YI FYGTRS N 
AEFVIHSGFFFDMNSHDRVKIKLGVSKSDRLYAMKAEVLARAGI 
PTSSVFALHFTEPPISAQLLAFLRVFCMTEEELKEHLLGDSAID 
RI FTLGNS EFPVS WDNEVKLWTFLEDRASLLLKTYKTTIEEDKS 
VLKNHDLSVRAKMAIKLRIXSEKEILEKAVKSAAVNREYYRQQME 
EKAPLP KY EESNLGLLESS VGDSRLPLVLRNLE EEAG VQDALN I 
REAI S KAXATENGL VKGEKS I PNGTRS ENES LNQES KRAVEDAK 
GSS SDS TAG VKE 
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Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)




1 


779 


SS WE FGI*SVIX3GLFLLFVLENMLGLLRHRGLRPRCCRRKRRNL 
ETRNLDPBNGSGMALQPLQAAPEPGAQGQREKNSQHPPALAPPG 
HQGHSHGHQGGTD 2 TWMVLLGDGLHNLTDGIiAIGAAFSDGFS SG 
LSTTLAVFCHELPHELGDFAMLLQSGLSFRRLLLLSLVSGALGL 
GGAVLG VGLS LG P VP LTPWVFGVTAGVFLYVALVEMLPALFPS S 
GAPAyA\HVLLQGLGLLLGGCLMLAITLLEERLLPVTTEG 


6828


3 


1654 


KSQHG/WILQLMHSCKEGYVKDLKGNPGLHRAMLDLDNGTRPSE 
LGHLSQTASLKRGSS FQSGRDDTWRYKTPHRVAFVEKLTKLVLS 
Uurn r wiujW ± to x vnbbljr 5>JS i AISKouQl ERS KNVRQRQNDPKKM 
IQEVMHSLVKLTRGALIiPLSIRDGEAKQYGGMEVKCELSGQWLA 
HAIQTVRLTHESLTALEIPNDLLQTIQDLILDLRVRCVMATLQH 
TAEEIKRLAEKEDWIVDNEGLTSLPCQFEQCIVCSLQSLKGVLE 
CKPGEASVFQQPKTQEEVCQLSINIMQVFIYCLEQLSTKPDADI 
DTTHLSVDVSS PDLFGS IHEDFSLTSEQRLLI VLSNCCYLERHT 
r LiiM i AUHr c JSJiNFQG I E KITQVSMASLKELDQRLFENYI E LKAD 
PIVGSLEPGIYAGYFDWKDCLPPTGVRNYLKEALVNI IAVHAEV 
FTISKELVPRVLSKVIEAVSEELSRLMQCVSSFSKNGALQARLK 
I CALRDTVAVYLT PES KS S FKQALEALPQLS SGADKKLLEELLN 
KFKS SMHLQLT CFQAAS STMMKT


6829 


1 


782


MRMEAGEAAPPAGAGGRAAGGWGKWVRLNVGGTVFLTTRQTLCR 
EQKS FLSRLCQGEELQSDRDETGAYLIDRDPT YFGPI LNFLRHG 
KL VLOKDMAEEGVLE EAE FYNI GPL I RI I KDRMEEKD YTVTQ VP 
flUl V i KVJjyuQBEEIiTQMvSTMSIXjWRFEQLWIGSSYWYGSED 
QAEFLCVVSKELHSTPNGLSSESSRKTKSTEEQLEEQQQQEEEV 
EE VEVEQVQVEADAQ E K/ CCYK PE APGCEAPDHLQGLG V P I 


6830 


1 


939 


MEPGSVENLSIVYRSRDFLWNKHWDVRIDSKAWRETL^KQL 
RYRFPELADPDTCYG FRFCHQLDFSTSGALCVALNKAAAGSAYR 
CFKERRVTKAYLALLRGHIQESRVTISHAIGRNSTEGRAHTMCI 
EG SQGCEN P K PS LTDL WLEI 1G L YAGDP VS KVLL KPLTGRTHQL 
RV\HCSALGHPWGDLTYGEVSGREDRPFRMMLHAFYLRIPTDT 
EC VE VCTPDP FLPS LDACWS PHTLLQSLDQLVQALRAT PD PDPE 
uiwj jr lurua r OAUuiru ruKF Jrfr fcTK r PETEAQRGPGLQWLSEWT 
LEPDS 


6831 


3 


1087


SLFFGSSTPDNKVAEQEDLETQPSPSVEJKAVTVIDPEGTIPTNF 
NVAEKPADHSLSEVKLKTADEPRGTLVKSGDGQNVKEKSMILSN 
VEDLQQPKPISEVSREDYGKKElSGDSEEMNINSWTSADGENli 
EIQSYSLIGEKLVMEEAKTIVPPHVTDSKRVQKPAIAPPSKWNI 
SIFKEEPRSDQKQKSLLSFDWDKVPQQPKSASSNFASKNITKE 
S E KP ES I IL P VEES KGS L I DFSEDRLKKEMQNPTS LKISEEET K 
LRSVSPTEKKDNIiENR\SYTL\AEKKVLAEKQNSV\APLELRDS 
NE IGKTQITLGS RS TE LKES KADAM PQHFYQNEDYNER P KI I VG 
SEKEKDEKKKK 


6832


1809 


412 


* ll >^ v "^wva*'^\^iyriovjii/iijj\jit'ej</iyiirttoljJk'Nir AtioQIiI?"ifEYLLV 
VS LK KKRSEDDYE P I IT YQ FPKRENLLRGQQEEE ERL LKAX PLF 
CFPDGNEWASLTEYPRETFSFVLTNVDGSRKIGYCRRIiLPAGPG 
PRLPKVYCIISCrGCFGLFSKILDEVEKRHQISMAVIYPPMQGti 
REAAFPAPGKTVTLKSFI P DSGl'KF I S LTRPLDSHLEHVDFSSL 
LHCLSFEQILQ1FASAVLERKIIFLAEGLSTLSQCIHAAAALLY 
P FS WAHT YI P WPESLLATVCCPTP FMVGVQMRFQQE VMD S PM2 
EVLLVNLCEGTFLMSVGDEKDILPPKLQDDILDSLGQGINELKT 
AEQINEHVSGPFVQFFVKIVGHYASYIKREANGQGHFQERSFCK 
ALTS KTNRRFVKKF VKTQL FS LFIQEAE KS KNP PAG YFQQKI L E 
YEEQ KKQ / TETKGKNCE I RAWNKND 


6833 


1 


1129 


PLMTLSQCGGIPGHGHSHGGHGHGHGLPKGPttVKS^RPGS^DIN
VAPGEQGPDQEBTNTLVANTSNSNGLKLDPADPENPRSGDTVEV 
QVNGNLVRE PDHMELEEDPJVGQLNMRGVFLHVLGDALGSVI VVV 
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SEQ 
ID
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








NALVFYFSWKGCSEGDFCVNPCFPDPCKAFVSIINSTHASVYEA 
GPCWVLYLD PTLC WM VC I LLYTT YPLLKESAL I LtiQTVPKQI D 
IRNLIKELRNVEGVEEVHELHVWQIAGSRIIATAHIKCEDPTSY 
MEVAKTI KDVFHNHG I HATT IQ PE FAS VG S KS S W PCE LACRTQ 
CALKQCCGTLPQAPSGKDAEKTPAVS I SCIiELSNNLEKKPRRTK 
AENIPA\WIEIKN\IPNK\QPESSL 


6834 


78 


1151 


AGQERPAPIWRLLWLPTPSVSRKAEPAHIPINR*GA*E*RGGLP 
LCGSSASAYGWH* RLTPWSPGGS * HM* SS KAPVTQARE VLVAG P 
CS KL VLSG ARG I VGTT VQ VLVEAQQ PLLLLFTG VWG LNLRAGEE 
S RAL * L I EE VTQ VRDAHLGNAWG CAQCLSQG Q VGSALAKALLE 
AAAAVRDCKEVLTVSGDKQQAEVSVRL*VRDVCVEEAGCYEFGQ 
AHGRPGLALAKGRGGTNEVEEQVQVDGVQKLVLSAHECHELVAG 
QQDGE DQAARTRLLQAGAHS VAHGRRQGQAP CRPHQEAG VS CHE 

LQQWGDAL+ARE*APQIIVLLLLEDVAQLRTGKKA*DLWDVE 
QLLRQL 


6835 


1 


834 


GIPAADR\EASLELIKLDISRTFPNtCIFQQGGPYriDMLHSIljG 
AYTC YRPDVG YVQGMS F I AAVL I LNLDTADAF I AFSNLLNKPCQ 
MAFFRVDHGLMLTYFAAFEVFFEENLPKLPAHFKKNNLTPDIYL 
I DW1 FTLYSKSLPLDLACRI WDVFCRDGEEFLFRTALGI LKLFE 
DILTKMDFIHMAQFLTRLPEDLPAEELFASIATIQMQSRNKKWA 
QVLTALQKDSREMREGKSVPPTLRLQREFALGTNQSPMPRPLCC 
FRLTPGQPRRTDAL 


6836 


1 


850 


MSCGRPPPDVDGMITLKV^DNLTYRTSPDSLRRVFfikyGRVGDV

YI PREP.HTKAPRGFAFVRFHDRRDAQDAEAAMDGAELDGRELRV 

QVARYGRRDLPRSRQGRRHAAGPEAA/RYGRRSRSYGRRSRSPR 

RRHRSRSRGPSCSRSRSRSRYRGSRYSRSPYSRSPYSRSRYSRS 

PYSRSRYRESRYGGSHYSSSGYSNSRYSRYHSSRSHSKSGSSTS 

SRSASTSKSSSARRSKSSSVSRSRSRSRSSSMTRSPPRVSKRKS 

KSRSRSKRPPKSPEEEGQMSS 


6837 


1 


1369 


tdgaavagnpgsdyfpggtap/ggprtrrp\sg! , sssgs1CS5gp~ 

PNPPAQGDGTSLSPNYTLESTSGNDGKPVSGGGGRGRGRRKRDS 
GHVS PGTFFDKYSAAPDSGGAPG VS PGQQQASGAAVGGS S AG ET 
RGAPTPHE KALTS PS WG KGAELLLGDQPDL IGS LDGGAKS DS S S 
PNVGEFASDE VS TSYANEDEVSS SS DNPQALVKASRS PLVTGS P 
KLPPRGVGAGEHGPKAPPPALGLGIMSNSTSTPDSYGGGGGPGH 
PGTPGLEQVRTPTSSSGAPPPDEIHPLEILQAQIQLQRQQFSIS 
EDQPLGLKGGKKGECAVGASGAQNGDS ELGSCCS EAVKS AMSTI 
DLDS LMAEHSAAWYMPADKALVDSADDDKTLAPWEKAKPQNPNS 
KEAHDLPANKASASQPGSHLQCLSVHCTDDVGDAKARASVPrWR 
S LKS DI SNRFGTFVAALT 


6838 


16 


499 


LTDTPPPKTHMIHHSISDYKATLRCWALGFYPMEITLTWQQDEE 
DQTRDMELVETRPAGDGTFQKWAAVWPSGEE/Q/RYMCHVQHE 
GLPEPLTLRWEQSSQPT IP I VGI VAGLVLLGAWTGAWSAVMC 
RKKNSDRVSYSEAASSDHAQGSDVSLTACKV 


6839 
6840


1 


1195 


AAPAGGG PD PEAIjS AFPGRHLSGLS W PQ VKRLDALLSEP I P I HG
RGNFPTLSVQPRQIRAGGPQHPGGAG \ IHVHRVRLHGSAASHVL 
HPESGLGYKDLDLVFRMDLRSEASFQIiTKAWliACLLDFLPAGV 
SRAKITPLTLKEAYVQKLVKVCTDSDRWSLISLSNKSGKNVEIiK 
FVDSVRRQFEFSIDS FQI ILDSLLLFGQCSSTPMSEAFHPTVTG 
ESLYGDFTEALEHLRHRVIATRS PEE I RGGGLIiKYCHLLVRGFR 
PRPSTDVRALQRYMCSRFFIDFPDLVEQRRTLERYLEAHFGGAD 
AARR YAC L VTLHRWNESTVCLMNHE RRQTLD L I AALALQALAE 

QGPAATAAIAWRPPGTDGVVPATVNYYVTPVQPLLAHAYPTWLP 
CN 




4254 


2061 


ELQGDFS VPDVPKSMAWCENS ICVGFKRDYYLIRVDGKGS IKEL 
FPTGKQIjE PLVAPLADGKVAVGQDDLTWLNEEG I CTQKCALNW 
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cor re spond i ng 

to first 

amino acid 
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amino acid 
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Predicted end 
nucleotide 
location 
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to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








TDIPVAMEHQPPYIIAVLPRYVEIRTFEPRLLVQSIEL^RPRFI 
TSGGSNIIYVASNHFVWRLIPVPMATQIQQLLQDKQFEIALQLA 
EMKDDSDSEKQQQIHHIKNLYAPNLPCQKRFDESMQVFAKLGTD 
PTHVMGLYPDLLPTDYRKQLQYPNPLPVLSGAELEKAHIiALIDY 
LTQKRSQLVKKLNDSDHQSSTSPLMBGTPTIKSKKKLLQIIDTT 
LLKCYLHTNVALVAPLLRLENNHCHIEESEHVLKKAHKYSELI I 
LY E KKGLH EKALQVL VDQS K KANS P LKGHE RTVQ Y LQH LG TE NL 
HLI FS YSVWVLRDFPEDGLKI FTEDLPE VES L PRDR VLGFL I EN 
FKGLAI PYLKHI IHVWEETGSRFHNCLIQLYCSKVQGLMKEYLL 

sfpagktpvpageeeqelgeyrokllmpleissyydpgrlicdf 

PFDGLLEERALLLGRMGKHEQALFIYVHILKDTRMABEYCHKHY 

drnki^kdvyi^llrmylsppsihclgpiklellepkanlqaa 
lqvlelhhskldttkalnllpantqindiriflekvleenaqkk 

RFNQVLKNLLHAEFLRVXQEERILHQQVKCI ITEEKVCMVCKKK 

ignsafarypngwvhyfcs\kevnpadt 


6841 


1 


3206 


tpsttgtksntptssvpsaa^pln^LqpWdygvgsknskra 
rekrdsrnmevqvtqemrnvsigmgssdewsdvqdiidstpeld 
mcpetrldrtgssptqgivnkafgintdslyhelstagsevigd 
vdegtu^llgefsgmgkevgnlllensqllbtknalnvvkndlia 
kvdqlsgeqevlrgeleaakqakvklenrikeleeelkrvksea 
iiarrepkeeaedvssylctesdkipmaqrrrftrvemarvlme 
rnqykerlmelqeavrwtemirasrehpsvqekkkstiwqffsr 
lfssssspppakrpypsgnihykspttagfsqrrnhamcpisag 
srpleffpdddctssarreqkreqyrqvrehvrnddgrlqacgw 
slpakykqlspnggqedtrmknvpvpvycrplvekdptmklwca 
agvnlsgwrpneddagngvkpapgrdpltcdregdgepksahts 
pekkkakelpemdatssrvwiltstlttskwi idanqpgtwd 
qftvcnahvlcissipaasdsdyppgemfldsdvnpedpgadgv 

IAGI TIjVGCATRCNVPRSNCSSRGDTPVIiDKGQGEVATIANGKV 

npsqsteeateatevpdpgpsepetatlrpgpltehvftdpapt 
pssgpqpgsengpepdssstrpepepsgdptgagssaaptmwlg 

AQNGW L Y VHS AVANWKKCLHS I KLKDS VLSLVHVKGR VLVALAD 

GTLAI fhrgedgqwdlsnyhlmdlghphhsircmawydrvwcg 
yknkvhviqpktmqieksfdahprresqvrqlawigdgvwvsir 

LDSTLRLYHAHTHQHLQDVDIEPYVSKMLGTGiCLGFSFVRITAL 
LVAGSRLWVGTGNG WI S I PLTETWLHRGQA LXiG \ LRANKTS P 
TSGEG\ARPGG\IIHVYG\DDSSDRAARSFIPYCSMAQAQLCFH 
GHRDAVKFFVSVPGNVLATJjNGSVLDS PAEGPGPAAPASEVEGQ 

KLRNVLVLSGGEGYIDFRIGDGEDDETEEGAGDMSQVKPVLSKA 
ERSHI I VWQVS YTPE 




6842 
6843


3 


926 


KCUgLSATILTDHQYI^RTPI^ILKQKAPQQYRIRAmSYKP 
RRLFQSVKLHCPKCHLLQEVPHEGDLDIIFQDGATKTPDVKLQN 
TSLYOS KIWTTKNQKGRKVAVHFVKNNGILPLSNE CLLL I EGGT 
LSE ICKLSNKFN S VI PVRSGHEDLELLDLSAPFLI QGTVHHYGC 
KQWS T * RS I QNLNSLVDKTS W I P SS VAE ALG I VPLQYVFVMTFT 
LDDGTGVLEAYLMDSDKFFQ I PASEVLMDDDLQKS VDMIMDMFC 
rrvj±is.x uax rWljfcCFIKSYNVTNGTDNQlCY QI FDTTVAEDVI 






2 


851 


WHRKVLSGAKRYECNECGKSFAYTSSLIKHRRIHTGERPYECSE 
CGRSFAENSSLIKHLRVHTGERPYECVECGKSFRRSSSLLQHQR 
VHTRERPYECSECGKSFSLRSNLIHHQRVHTGERHECGQCGKSF 
SRKSSLIIHIjRVHTGERPYECSDCXSKSFAENSSLlKHLRVHTGE 
RPYECIDCGKSFRHSSSFRRHQRVHTGMRPYK*SKFWKFSCPGF 
LLLCX5QRVHTGSRC YECDKWG I FFS *NAS FFT* KSAPTEEVP FE 
CNECEKAFS PliSLVTTl FT 




6844 


244 


642 


KHQIAGFELRICTOTSMSLGTTREKTDRVKSTAYLSPQELEDVFY
QYDVKSEIYSFGIVLWEIATGDIPFOGCNSEKXRKLVAVKRQQE 
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Predicted end 
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amino acid 
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Amino acid segment containing signal peptide ' 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








PLGEDCPSBLREI I DECRAHDPS VR PSVDEILKKIjSTFS K* CI K
I 


6845 


3 



1519 


VAVRDECYWRkVFWDQDLW^FILMCKPETARARLE?RTRTLD
GAL EN AQNLG YQGAKFAWESAD SGLE VCPE D I YGVQEVHVNGAV 
GLAFELYYHTTQDLQLFREAGGWDWRAVABFWCSRVEWSPREE 
KYHLRGVMSPDBYHSGVNNSVYTNVLVQNSLRFAAALAQDLGLP 
I PSQWLAVADKIKVPFDVEQNFHPEFDGYEPGEWKQADWLLG 
Y P VP FS LS PDVRRKNIjE I YEAVTS PQGPAMTWS MFAVGWM ELKD 
AVRARG LLDRS FANMAE P FKVWT E N ADG SGAVNFLTGMGG FLQA 
WFGCTGFRVTRAGVTFDPVCLSGISRVSVSGIFYQGNKLNFSF 
SEDSVTVEVTARAGPWAPHLEAELWPSQSRLSLLPGHKVSFPRS 
AGRIQMSPPKLPGSSSSEFPGRTFSDVRDPLQSPLWVTLGSSSP 
TESLTVDPASE*SGTGASETSLGPSLWPRLHPPLLGTLLACHPS 
PAARLSGKVHAAWPEFKAFCL 


6846 


213 


1258 


LYFLKTIK*LNRLAEHP*YENEKLTKLRNTIMEQYTRTEESARG 
I I FTKTRQSAYALSQ W I TENEK FAE VG VKAHHL I GAGHS SE FKP 
MTQNEQKB V I S KFRTG KI NLLXATT VAEEGLDI KECN I VI R YGL 
VTNE I AM VQARGRARADES TYVLVAHSGSGV I EHET VND FREKM 
MYKAIHCVQNMKPEEYAHKILELQMQSIMEKKMKTKRNIAKHYK 
NNPS LITFLCKNCS VLACSGEDIHVI EKMHHVNMTPEFKELYI V 
RENKTLQKKCAD YQ INGE I I CKCGQ AWGTMMVHKGLDL P CLKIR 
NFVWFKNNSTKKQYKKWVELPITFPNLDYSECCLFSDED 


6847 


1450


348 


SMCWNSDRLEMPLIDLAIjILYPPSYVPYTGHLSDDSL6RKYCLT 
WFEDALNGVL*RAEAIQPHCVNAGDRMEKFRQKYWNKLQTLRQQ 
PFAYGTLTVRSLLDTREHCLNEFNFPDPYSKVKQRENGVALRCF 
PGWRSLDALGWEERQLAJLVKGLLAGNVFDWGAKAVSAVLESDP 
YFGFEEAKRKLQERPWLVDSY5EWLQRLKGPPHKCALIFADNSG 
IDI I LGVFPFVRELLLRGTE VIIACNSGPALN0VTHSES L I VAE 
R I AGMD P WHS ALREERLLLVQTGS S S PCLDLS RLDKGLAALVR 
ERGADLWIEGMGRAVHTNYHAAIiRCESLKIiAVI KNAWLAERLG 
GRLFSVI FKYEVPAE 


6848 


19 


16 


AIVIWWNSIJDGIRNIVLSNPKKRNTLSLAMLKSLQSDlLHDADSND
LKVIIISAEGPVFSSGHDLKELTEEQGRDYHAEVFQTCSKVMMH 
I RNHP VP V I AM VNGLATAAGCQL VAS CDI AVAS DXSS FAT PGVN 
VGL FCS TPG VALARAVPRKVAtiEMLFTGEP I SAQEALLHGLLNK 
WPEAELQEETMRrARKIASIiSRPWSLGKATFYKQtiPQDLGTA 
YYLTS Q AMVDNLALRDGQEG I TAFLQKRKPVWSHEPV* VEH 


6849 


70 


821 


SLGVr>GSCLKQGSPAPRPQTDTSP*PVGWWATQQEDLYriQSYEC~~ 
VCVLFASVPDFKE FYSESNINHEGLECLRLLNE I IADFDELLS K 
PKFSG VEK I KT I GS TYMAATGLNATS GQDAQQDAERSCSHLGTM 
VEFAVALGSKLDVINKHSFNNFRLRVGLNHGPWAGVIGAQKPQ 
YDIWGNTVNVASRMESTGVLGKIQVTEETAWALQSLGYTCYSRG 
VIKVKGKGQLCTYFLNTDLTRTGPPSATLG 


6850 


2 


1235 


ARGLNHEWTFEKLRQHISRNAQDKQELHLFMLSGVPDAVFDLTD
LDVIi K t*ELI PEAKI PAKIS QMTNLQE LHL CHC PAKVEQTAFS FL 
RDHLRCLHVKFTDVAEIPAWVYLLKNLRELYLIGNLNSENNKMI 

KLLVLNSLKKMMNVAELELQNCELER I PHAI FSLSNLQELDLKS 
NNIRTIEEIISFQHLKRLTCLKLWHNKIVTIPPSITHVKNLESL 
YFSNNKLESLPVAVFSLQKLRCXDVSYNNISMIPIEIGLLQNLQ 
HLHITGNKVD1LPKQLFKCIKLRTLNLGQNCITSLPEKVGQLSQ 
LTQLELRGNCLDRLPAQLGQCRMLKKSGLWEDHLFDTLPLEVK 
EALNQD IN I P FANG I 


6851 


1765 


660 


VSAQVSAREGENCLGWNLADSSQESYKSIjEEAEDCYPPSLLTLD 
LRDLFNQVEQGPLLSCPKAGTDLSMGRAREVGWMAAGLMIGAGA 
CYCVYKLTIGRDDSEKLEEEGEEEWDDDQELDEEEPDIWFDFET 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion, \=possible nucleotide insertion)








MARPWTEDGDWTEPGAPGGTEDRPSGGGKANRAHPIKQRPFPYE ' 
HKNTWSAQNCKNGSCVLDLSKCLFIQGKIjLFAEPKDAGFPFSQD 
INS H LAS LS MARNTS PTPDPTVREALCAPDNLNAS I ES QGQIKM 
YINEVCKKTVSRCCNS PLQQAGLNLLISMTVINNMLAKSASDLK 
FPLISEGSGCAKVQVLKPLMGLSEKPVLAGELVGAQMLFSFMSL 
F I RNGNRE I L LETPAP 


6852 


1 


407 


RTRGEETYANFIKHNDGKNIFYAARTPATLFAVMFAMYTlSGLT 
GFIGLNSIAVLCNLVMGliALIFLCTWAYVKYSGEFREIGTVIDQ 
IAETLWEQVLKPLGDNLMEENIRQSVTNSIKAGLTDQVSHHARL 
KTD 


6853


3 


469 


GDSCAVCIELYKPNDLVRILTCN&I FriKTCVDPKfLLEHRTCPMC ' 
KCD IXjKALG IE VDVEDGS VS LQVPVSNE I FNS ASSHEEDNRSET 
ASSG YASVQG TYEP PLEEHVQSTNE S LQLVNHEANS VAVD VI PH 
VDNPTFEEDETPNQETAVREIKS 


6854 


1148 


585 


HES Y I GTFD PGELCVGAA I Q WLQDNS AS YFLNRKLVYE PS TQAK 
P VKNT FLRMW I YSHHI YQQDLRKK I LD VG KRLD VTG FCMTG KPG 
1 1 C VEGFKEHCEE FWH T I RY PNWKH I SCKHAES VETEGNGEDLR 
LFHS FEELLLEAHGDYGLRNDYHMNLGQFLEFLKKHKS EHVFQ I 
LFGIESKSSDS 


6855 


1913 


1148 


GR VGGRVG R I CS P LSGANE Y IAS TDTLKTEE VLIjFTDQTDDLAK 
E EPTS LFQRDS ETKGE SGLVLEGDKE IKQI FEDLDKKLALAS R F 
YIPEGCIQRWAAEMVVAtDALHREGIVCRDLNPNNILLNDRGHI 
QLTYFSRWSEVEDSCDSDAIERMYCAPEVGAITBETEACDWWSL 
G AVIj FE LLTG KTLVECH PAG I NTHTTLNMPEWVS EE ARS L I QQL 
LQFNPLERLGAGVAGVEDI KSHPFFTPVDWAELMR 


6856


1617 


997


VTQLYVSVDASTKDSLKKIDRPLFKDFWQQFLDSLKALAVKQQR 
TVYRLTLVKAWNVDELQAYAQLVSLGNPDFIEVKGVTYCGESSA 
S S LTMAHVP WHEE WQ FVR ELVDLI P EYE IACEHEHSNCLLI AH 
RKF K IGGE WWT WINYNRFQEL IQE YEDSGGS KTFS AKD YMART P 
HWALFGASERG FD PKDTRHQRKN KS KAISGC 


6857 


1 


517 


KGPEATAMVCVCSHPNCRQNH JKpSriSAAQTWCGS PTPASAPNH " 
KLMAMEQGKTLPSATEDAKEEGLEAQISRLAELIGRLESKALWF 
DLQQRLSDEDGTNMHLQLVRQEMAVCPEQI1SEFLDSLRQYI1RGT 
TG VRNC FH I TAVRLSDGF T FVI YEFWETEEAWKRHLQS PL CXAF 
RHVKVDTLSQPEALSRILVPAAWCTVGRD 


6858 


2


669 


RSRGIKDFENDPPLSSCGIFQSRIAGDALLDSGIRISSVFASPA " 
LRC VQTAKLI LEELKLE KK I KI RVE PG I FEWTKW E AGKTTP TLM 
SLEEIiKEANFNIDTDYRPAFPLSALMPAESYQEYMDRCTASMVQ 
IVNTCPQDTGVILIVSHGSTLDSCTRPLLGLPPRECGDFAQLVR 

KIPSLGMCFCEENKEEGKWELVNPPVKTLTHGANAAFNWRNWIS 
GN 


6859 
6860


1 


1150 


GETMFKKAKTKAKKKPRKRSDSSGGYNLSDIIQSPSSTGLLKSG " 
KTNS VE S LPELLTS DSEGS YAGVGSPRDLQS PD FTTGFHSD K I E 
AKVKPYVNGTSPVYSREDLKPWEKSP IbKISAPQPI PSNRI DTT 
SSASWVAGSFSPVSPPWDLRTIMEIEESRQKCX5ATPKSHLGKT 
v stKa v itiiby R KM I ALTTKENNSGMNS MErVLFTPS KAPKPVN 
AWASS LHS VS S KS FRD FLLEEK KSVTSHSSGDHVKKVS FKG I EN 
SQAP KI VRCS THGTPGPEGNHI S DLPLLDSPNPWLS SS VTAPS M 
VAPVTFASIVEEELQQEAALIRSREKPLALIQIEEHAIQDLLVF 
YEAFGNPEEFVIVERTPQGPLAVPMWNKHGC 




1889 


1515 


DKDK KRQKKRG I FPKVATN I MRAWLFQHLTH PYPS EEQKKQLAQ 
DTGLT I LQVNNWF INARRI I VQPMIDQSNRAVS QGAAYS PEGQ P 
MGS FVLDGQQHMG I RPAG PMSGMGMNMGMDGQWHYM 


6861 


1889 


1515 


D KD K KRQK KRG I FP KVATN I MRAWL FQH LTHP YPS E EQKKQ LAQ 
DTGLT I LQVNNW FINARR 1 1 VQ PM I DQSNRAVS QGAAYS PEGQ P 
MGS FVLDGQQHMG IRPAGPMSGMGMNMGMDGQWHYM 


6862 


2 


471 


EEIDRBFHNKLKLKEDKLEKQBKPVNGEDKGDSGVDTQNSEGNA 
DE ED PLG PNCY YDKTKS FFDN I S CDDN RERR PTWAEE RRLNAET 
FGI P LRPNRGRGG YRGRGGLG FRGGRGRGGGRGGTFTAPRG FRG 
GFRGGRGGRE FAD FE YRKTTAFG P 


6863 


2216 


487 


PQ E PAL KS E FSQVASNT I P LP LPQPNTCKDNG PCKQ VCS TVGGS 
AI CS CF PG YAIMADGVS CEDQDE CLMG AHDCS RRQFCVNTLGS F 
YCVNHTVLCAIX3YILNAHRKCVDINECVTOLHTCSRGEHCVNTL 
GSFHCYKALTCEPGYALKDGECEDVDECAMGTHTCJQPGFLCQNT 
KGSFYCQARQRCMDGFLQDPEGNCVDINECTSLSEPCRPGFSCI 
NTVGSYTCQRNPLICARGYHASDDGTKCVDVNECETGVHRCGEG 
QVCHNLPGSYRCDCKAGFQRDAFGRGCIDVNECWASPGRLCQHT 
CENTLGSYRCSCASGFLLAADGKRCEDVNECEAQRCSQECANIY 
GSYQCYCRQGYQUVEDGHTCTDIDECAQGAGILCTFRCLNVPGS 
YQCACPEQG YTMTANGRS CKDVDECALGTHNCSEAETCHN IQGS 
FR CLRFECP PNYVQ V5 KTKCERTTCHDFLECQNS PAR I THYQLN 
FQTG LL VPAH I FRIG PAPAFTGDTI ALNI IKGNEEGY FGTRRLN 
AYTGVVYLQRAVLEPRDFALDVEMKLWRQGSVTTFLAKMHI FFT 
TFAL 


6864 


2 


2933 


LADSSPSNLQI I IKELLSMHHQPDPALTKEFDYLPPVDSRSSSG 
FVGLRNGGATCYMNAVFQQLYMQPGLPESLLSVDDDTDNPDDSV 
FYQVQSLFGHLMESKLQYYVPENFWKIFKMWNKELYVREQQDAY 
EFFTSLIDQMDEYLKKMGRDQI FKNTFQGIYSDQKI CKDCPHRY 
ERBEAFMALNLGVTSCQSLEISLDQFVRGEVLEGSNAYYGEKCK 
EKRITVKRTCI KSLPSVLVIHLMRFGFDWESGRS IKYDEQIRFP 
WMLNMEPYTVSGMARQDSSSEVGENGRSVDQGGGGSPRKKVALT 
EN YE LVGVI VHSGQAHAGH YYSFIKDRRGOGKGKW YKFNDTVI E 
EFDLNDETLE YECFGGEYR PKVYDQTNPYTDVRRRYWNAYMLFY 
QR VSDQNSP VLP JCKSRVS WRQEAEDLSLSAPSS P B I SPQ S S PR 
PHRPNNDRLSILTKLVKKGEKKGLFVEKMPARIYQMVRDENLKF 
MKNRDVYSSDYFSFVLSJ^LNATKLPOTYYPCr^KVSLQJLAIQ 
FLFQTYLRTKKKLRVDTEEWIATIEALLSKSFDACQWLVEYFIS 
S EGREL I KI FLLECNVRBVRVAVATI LEKTLDSALF YQDKLKSL 
HQLLE VLLALLDKDVPBNCKNCAQYFFLFNTFVQKQG I RAGDLL 
LRHS ALRHMI S FLLGASRQNNQ IRRWS S AQAREFGNLHNTVALL 
VLHSDVSSQRNVAPGIFKQRPPISIAPSSPLLPLHEEVEALLFM 
S EGKP YLLE VM FALRELTG SLLALI EMWYCCFCNE HFS FTMLH 
F I KNQLETAPPHELKNTFQLLHE I LVI EDPIQ VER VKF VFETEN 
GLLALMHHSNHVDSSRCYQCVKFLVTLAQKCPAAKEYFKBNSHH 
WS WAVQWLQKKMSEHYWTLQSNVSNETSTGKTFQRT I SAQDTLA 
YATALLNEKEQSGS SNGSE S S PANENGDRHLQQGS ES PMMIGEL 
RSDLDDVDP 


6865 


1820 


1242 


DPERWKHLSKVTPPGSSVSTTPVQWRLQSPQSQGSMMPSCNRS 
CSCSRGPSVEDGKWYGVRSYLHLFYEGYAVPPKLEGIGEGEFLV 
LDQRAADYNQALGTCRLAGTALCVAAGVLLAI CLFWAM IGWLSQ 
DTKAEPLDPEADSHVEVFGDEPEQQLSPIFRNASGQSWFSPPAS 
PFGQSSVQTIQPKRDS 


6866 


1571 


435 


D CPRPRYT LYGLRATCMRDLDWAW INAVSAFKAliEQDLP VNI KF 
I IEGMEEAGS VAL EEL VE KE KDRFFSGVDYI VI SDNLW I SQRXP 
AITYGTRGNSYFMVEVKCRDQDFHSGTFGGILHEPMADLVALLG 
SLVDSSGHILVPGIYDEWPLTEEEINTYKAIHLDLEEYRNSSR 
VE KFLFDTKEE I LMH LWRYPSLS IHG IEGAFDEPGTKT V I PGRV 
IGKFSIRLVPHMNVSAVEKQVTRHLEDVFSKRNSSNKMWSMTL 
GLHPWIANIDDTQYLAAKRAIRTVFGTEPDMIRDGSTIPIAKMF 
QEIVHKSWLI PLGAVDDGEHSQNEKINRWNYI EGTKLFAAFFL 
EMAQLH 


6867 


2833 


1704 


GTRIMSQPKQKELAGFVRQKMLLDYSVYMGRCVPQESRSPQRSP 

LQSAESSPTAGKKLPEVPPSEEEEQBAWVNALLGR1FWDFLGEK 

YWSDLVS KK I QMKLS K I KIiP YFMNBLTLTELDMGVAVPKILQAF 

KPYVDHQGLWI DLEMS YNGS FTjMTLETKMNLTKLGKEPLVEALK 

VGEIGKEGCRPRAFCLADSDEESSSAGSSEEDDAPEPSGGDKQL 

LPGAEGYVGGHRTSKIMRPVDKITKSKYFQKATETEFIKKKIBE^ 

VSNTPLLLTVEVQECRGTLAVNIPPPPTDRVWYGFRKPPHVELK 

ARPKLGEREVTLVHVTDWIEKKLEQEFQKVFVMPNMDDVYITIM 

HSAMDPRSTSCLLKDPPVEAADQP 


6868 


1 


346 


R PTR P PTR~PE B t KNL IUYI SDMJJ F VQDIjCEDF YEliJF* iCttikb FD 

KATFESQMSVMRGQILNLTQAjURDGKSPFQLVQIPCVIVERSQG 
GSQGR I VHLSNS FTQTVNCRKPF FSS W 


6869 


3 


1619 


mymermdkralisfwesvehlknanknbipqlvgeiyqnffves 
keisvekslykeiqqclvgnkgievfykiqedvyetlkdryyps 

FIVSDIiYEKIiLIKEEEKHASQMISNKDEMGPRDEAGEEAVDDGT 

nqineqasfavnklrelnekleykrqalnsiqnapkpdkkivsk 
lkr>eiiliekertdlqlhmartdwwcenlg^5wkasitsgevtee 
ngeqlpcyfvmvslqevggvetknwtvpkrlsefhnlhrklsec 
vpslkkdqlpslsklpfks idhtfmekfenqlnkflqnllsder 
lcqsealyaflspspdylkvidvqgkknsfslssflerlprdff 
shqeeeteedsdlsdygddvdgrkdalaepcfmligeifelrgm 
fkwvrrtlialvqvtfgrtinkqirdtvswifseqmlvyyinif 
rdafwpngklappttirskeqsqetkqraqqkllenipdmlqsb 
vgqqnarhgi iki fnalqetrankhllyalmelliilelcpelrv 
hldqlkagqv 


6870 


1 


1566 


MAAWAATRWWQLLLVLSAAGMGASGAPQPPNILLLtiMDDMGWG 
DLGVYGEPSRETPNLDRMAAEGLLFPNFYSANPLC9PSRAALLT 
GRLP I RNG F YTTNAHARNA YTP QE I VGG I PDSBQLLPELL K KAG 
YVSKIVGKWHLGHRPQFHPLKHGFDEWFGSPNCHFGPYDNKARP 
NIPVYRDWEMVGRYYEEFP INLKTGEANLTQIYLQEALDFIKRQ 
ARHHPFFLYWAVDATHAPVYASKPFLGTSQRGRYGDAVREIDDS 
IGKILELIjQDLHVADNTFVFFTSDNGAALISAPEQGGSNGPFLC 
GKQTTFEGGMREPALAWWPGHVTAGQVSHQLGS IMDLFTTSLAL 
AGLTPPSDRAIDGLNLLPTLLQGRLMDRPIFYYRGDTLMAATLG 
QHKAHFWTWTNSWENFRQGIDFCPGQNVSGVTTHNLEDHTKLPL 
IFHLGRDPGERFPLSFASAEYQEALSRITSWQQHQEALVPAQP 
QLNVCNWAVMNWAPPGCEKLGKCLTPPESIPKKCLWSH 


6871 


203 


1126 


RiVISLNPPIFLKRSEENSSKFVETKQSQTTSIASEDPLONteLAS 
QEVLQKAQQSGRSKCLKCGGSRMFYCYTCYVPVENVpiEQIPLV 
KLPLKIDXXKHPNETDGKSTAIHAKLLAPEFVNIYTYPCIPEYE 
EKDHEVALI FPGPQS I S I KD IS FHLQKR IQNNVRGKNDDPDKPS 
FKRKRTEEQEFCDLNDSKCKGTTLKKI I FIDSTWNQTNKIFTDE 
RLQGLLQVELKTRKTCFWRHQKGKPDTFLSTIEAIYYFLVDYHT 
DILKEKYRGQYDNLLFFYSFMYQIilKNAKCSGDKETGKLTH 


6872 
6873


880 


459 


FGLLMWLS LI FMKGNCVREDL I FNF LP KIjGLD VR ETNGL FGNT 
KXLI TEyFVRQKYLE YRRI P YTEPAE YEFLWGPRAFLETSKMLV 
IjK * XjAIUjH KKD pqs W PFHYI*E alaece WEDTDEDE PDTGDS AHG 
PTSRPPPR 




1929 


955 


DEQAVLCSKDKTYDL.KIADTSNMLLFIPGCKTPDQLKKEDSHCN 
IIHTEIFGFSNNYWELRRRRPKLKKLKKLLMENPYEGPDSQKEK 
DS NSS KYTTEDL LD Q I QASEEE I MTQLQVLNACKIGG YWR I LE F 
DYEMKLLNHVTQLVDSESWSFGKVPLNTCLQELGPLEPEEMIEH 
CLKCYGKKYVDEGEVYFELDADKI CRAAARMLLQNAVKFNIJVEF 
QE VWQQS VPEGMVTS LDQL KGLAL VDRHS RPE I I FLLKVDDLP E 
DNQ ERFNS LFSLREKWTEED IAP Y I QDLCGE KQT I GALLTK YSH 
S3MQNG VKVYNSRR PIS 


6874 


1 


307 


DS I ADHVNSAAVNVEEGTKNLGKAAKYKLAAiiPVAGALIGGMVG 
GPIGLLAGFKVAGIAAALGGGVLGETGGKLTQRXKQKMMEKLTS 
SCPDLPSQTDKKCS 


6875 


1688 


349 


VIGTGSRGNS AS E KWE I MFNEELGDP F 1 1 1 HS I S LLNAE EHS I A 
T LLLR I EKBE LDMKGSG F YVSLE WVT I S KKNQDNKK YE 1 1 KRD I 
LRG KS VPH YAAI E P DGNGLM I VS YKS LTFVQAGQDLEENMDED I 
SEKIKEPLYYWQOTEDDLTVTIRLPEDNTKEDIQIQFLPDHINI 
VLKDHQFLEG KL YS5 1 DHE SSTW 1 1 KESNS LE I S LI KKN EGLTW 
PELVIGDKQGELIRDSAQCAAIAERLMHLTSEELNPNPDKEKPP 
CNAQELEE CD I FFE ES S SLCR FDGNTLKTTHWNLGSNQ YLFS V 
rVDPKEMPCFCLRHDVDALLWQPHSSKQDDMWEHIATFNALGYV 
QASKRDKKFFACAPNYSYAALCECIiRRVFIYRQPAPMSTVLYNR 
KEGRQ VGQ VAKQQ VAS LETND P I LGFQATNBRLF VLTTKNL FL I 
KVNTEN 


6876 


41 


1285 


VGEMTLIWRHLLRPLCLVTSAPRILEMHPFLSLGTSRTSVTKLS " 
LHTKPRMPPCDFMPERYQVIFLVNSGSEANEIAMLMARAHSNNI 
DI I S FRGAYHGCS PYTLGLTNVG I YKMELPGGTGCQPTMCPDVF 
RGPWGGSHCRDS PVQTIRKCSCAPDCCQAKDQYIEQFKDTLSTS 
VAKS IAGFFAE P IQGVNG WQYPKGFLKEAFELVRARGGVCIAN 
E VQTGFGRLGSHFWG FQTHDVLPDI VTMAKG IGNGFPMAAVI TT 
PEIAKStAKCLQHFNTFGGNPMACAIGSAVLEVIKEENLQENSQ 
EVGTYMLLKFAKLRDEFEIVGDVRGKGLMIGIEMVQDKISCRPL 
PREEVNQIHEDCKHMGLLVGRGS I FSQTFRIAPSMCITKPEVDF 
AVEVFRSALTQHMERRAK 


6877


1 


778 


"GTSPSPARAYAPPTERKRFYQNVSI^EGGFEINLDHRKLKTP 
QAKLFTVPSEALAIAVATEWDSQQDTI KY YTMHLTTLCNTS LDN 
PTQRNKDQfc I RAAVKFLDTDT I CYRVEE PETLVELQRNEWDP I 1 
EWAE KRYGVE ISSSTSIMGPSI PAKTREVLVSHLAS YNTWAIjQG 
IEFVAAQLKSMVLTLGL IDLRLTVEQAVLLSRLEEE YQ I QKWGN 
IEWAHDYELQELRARTAAGTLFIHLCSESTTVKHKLLKE 


6878 


931 


263 


c^lqgdfknraemidfniriknvtrsda6kyrcevsapseqgqn 

LEEDTVTLEVLVAPAVPSCEVPSSALSGTWELRCQDKEGNPAP 
EYTWFKDGIRLLENPRLGSQSTNSSYTMNTKTGTLQFNTVSKLD 
TGEYSCEARNS VGYRRCPGKRMQVDDLNISGI IAAWWALVIS 
VCGLGVC YAQR KG YFS KETS FQKSNS S SKATTMS ENDFKHTKSF 
II 


6879


3 



845 


IRVIGESDIMQEFLSESDENYNGVSDVELRVALPDGTTVTVRVK ' 

KNSTTDQVYQAIAAKVGMDSTTVNYFALFEVISHSFVRKLAPNE 

FPHKJjYIQNYTSAVPGTCLTIRKWLFTTBEEILLNDNDLAVTYF 

FHQAVDDVKKGYI KAEEKS YQLQKLYEQRKMVMYLNMLRTCEG Y 

NEIIFPHCACDSRRKGHVITAISITHFKLHACTEEGQLENQVIA 

FEWDEMQRWDTDEEGMAFCFEYARGEKKPRWVKIFTPYFNYMHE 

CFERVFCELKWR KEE Y 


6880 


2110 


1437 


RKDNCTAKEWTFPEAXWNTTARVFSHIRLGMGHVLIIVQCFISS 
MANI YNEKILKEGNQLTES I FIQNSKLYFFGIIiFNGLTLGLQRS 
NRDQ I KN GO FFYGHRAFSVAL I FVTAFQGLSVAF I LKFLDNMFH 
viiPiAU v i TV 1 1 TTVS VL VFD FRPS LE FFLEAPS VhhS I F I YNAS 

KPQVPEYAPRQERIRDLSGNLWERSSGDGEBLERLTKPKSDESD 
EDTF 


6881 


2638 


2244 


NDSKWEDIHVITGALKMFFRfiLPEPIiFTFNHFNDFVNAIKQEPR " 

QRVAAVKDLIRQLPKPNQDTMQILFRHLRRVIENGEKNRMTYQS 

IAIVFGPTLLKPEKETGNIAVHTVYQNQIVELILLELSSIFGR 


6882 


1 


850 


GI PEAQLWI YPVKS CKGVPVSEAECTAMGLRSGNLRDRFWLVIN 
QEGNM VTARQEPRIi VLISLTCDGDTLTLS AA YTKDLLLPI KT PT 
TNAVHKCRVHGLE I EGRDCGEATAQWITS FLKSQP YRLVHFE PH 
MRPRRPHQIADLFRPKDQIAYSDTSPFLILSEASLADLNSRLEK 
KVKATNFRPNIVISGCDVYAEDSWDELLIGDVELKRVMACSRCI 
LTTVDPDTGVMSRKEPLETLKSYRQC33PSERKLYGKSPLFGQYF 
VLENPGTI KVGDPVYLLGQ 


6883 


2794 


2256 


NSKIiKLNQNLKLFITLTYQVLSLHGWGPGIHL^KEGAFPVTQNR 
ALQLLYDLR YLNI VLTAKGDEVKSGRSKPDSR I EKVTDH LEAL I 
DPFDLDVFTPHLNSNLHRLVQRTSVLFGLVTGTENQLAPRSSTF 
NSQEPHNILPLASSQIRFGLLPIiSMTSTRKAKSTRNIETKAQYD 
ANC 


6884


2 


99 


EFERVTAEAVKPRETSE PRAAAQR FC EKFPFL 


6885 


297 


1554 


STGQFWHVTDLHLDPTYH ITDDHTKVCASSKGANASNPG PFGDV 
LCDS P YQLI LS AFDF I KNS GQEAS FM I WTGDS P PHVPVP ELSTD 
TVI NVI TNMTTT IQSLF PNLQVFPAIiGNHD Y W PQDQLS WTS KV 
YNAVANLWKPWLDEEAIS'TLRKGGFYSQKVT'mPNLRI ISLNTN 
LYYGPNIMTLNKTDPANQFEWLESTLNNSQQNKEKVYI IAHVPV 
GYLPSSQNITAMREYYNEKLIDI FQ KYSDVIAGQFYGHTHRDSI 
MVLSDKKGSPVNSLFVAPAVTPVKSVLEKQTNNPGIRLFQYDPR 
DYKLLtJMLQYYLNLTEANLKGES IW KLEYILTQTYDIEDLQPES 
LYGLAKQFTILDSKQFI KYYNYFFVS YDSSVTCDKTCKAFQ I CA 
IMNLDN I S YADCLKQLY I KHNY 


6886 


2 


1341 


QARVSQELKKAAKRTVS ISEGPDTLGDGMRERRETLALAPEPEP 
LEKEACE KWKR P FRS AS ATS LTLSHCVDWKGLLD FKKRRGHS I 
GGAPEQRYQI I PVCVAARLPTRAQDVLDAHLSEVNAVRFGPNSS 
LLATGGADRL IH LWNVVG S RLE ANQTLEG AGGS ITS VDFDPSGY 
QVIiAATYNQAAQLWK^GEAQSKETLSGHKDKVTAAKFKLTRHQA 
VTGSRDRTVKEWDLGRAYCSRTINVLS YCNDWCGDHI I ISGHN 
DQKI RFWDSRG PHCTQVI PVQGRVTSLSLSHDQLHLLS CSRDNT 
LKVI DLRVSNIRQVFRADGFKCGSDWTKAVFSPDRS YALAGS CD 
GAL Y I WDVDTGKL ESRLQG PHCAAVNAVAWCY SGSHMVS VDQGR 
KWLWQ 


6887 


1047 


116 


WTARPSQKPFWEAGAVPGDPLSTGCSQAQLGGCCPRGPWGPQHG 
GQQRAAGPTLPRGERGGPQQSGPGLAAQTPPTSKQVAWRAFLTG 
TYRSQS PRSPAGP FRGGTG WWPE PAVCLCVAVGPORLS SPGLVY 
NASGSEHCYDIYRLYHSCADPTGCGTGPDARAWDYQACTEINLT 
PASNNVTDMFPDLPFTDELRQRYCLDTWGVWPRPDWLLTS FWGG 
DLRAASNI I FSNGNLDPWAGGGI RRNLS ASVI AVTI QGGAHHLD 
LRASHPEDPASWEARKLEATI IGEWVKAARREQQPALRGGPRL 
SL 


6888 


1 


992 


FVAWKKEIPH!tWTHCLLNPHALVIKTLPTKtRDALFT\AmVI 
NFI KGRAPNHRLFQAFFEEIG I E YSVLLFHTEMRWLSRGQILTH 
IFEMYEEINQFI^HKSS^VDGFENKEFKIHLAYLADLFKHLNE 
LS AS MQRTGMNTVS AREKLS AF VRKFPFWQ KRIEKRN FTNF P FL 
EEI I VSDNEGIFIAAEITLHLQQLSNFFHGYFS IGDLNEASKWI 
LDPFLFN I D FVDDS YLMKNDLAELRASGQ I LMEFETMKLEDFWC 
AQFTAFPNLAKTALEILMPFATTYLCELGFSITFTFQNKVPEAA 
LI LSDD IRVAISKKVPSFLGHH 


6889 


1 


1534 


LTLENQ I KE E REQDNS ES PNGRTS PLVSQNNEQGSTLRDLLTTT 
AGKLRVGSTDAGIAFAPVYSMGAPSSKSGRTMPNILDDIIASW 
ENKIPPSKTSKINVKPELKEEPEESIISAVDENNKLYSDIPHSW 
ICEKHILWLKDYKNSSNWKLFKECWKQGQPAWSGVHKKMNISL 
WKAES I S LD FGDHQAD LLNCKD S I ISNANVKEFWDGFEBVS KRQ 
KNKSGETWLKLKDWPSGEDFKTMMPARYEDLLKSLPLPEYCNP 
EGKFNLASHLPGFFVRPDLGPRLCSAYGWAAKDHDI GTTNLHI 
EVSDWNILVYVGIAKGMGILSKAGILKKFEEEDLDDILRKRLK 
DSSE I PGALWHI YAGKDVDKIRBFLQKISKEQGLEVLPEHDPIR 
DQSWYVNKKLRQRLLEEYGVRTWTLIQFLGDAIVLPAGALHQVQ 
NFHSCIQVTEDFVSPEHLVESFHLTQELRLLKEEINYDDKLQVK 
NI LYHAVKEM VRALKIHEDE VDDMEEN 


6890


3 


667 


THACGMWIPLYLHRALWHKTAETCNSPPCGAKDiLIPGAIT^F 
TGFLGVDTGAGATRWCRLKTQRADPLVCAVGMLGS AI FICLIFV 
AAKSSIVGAYICIFVGETLLFSNWAITADILMYWIPTRRATAV 
ALQSFTSHLLGDAGSPYLIGFISDLIRQ8TKDSPLWEFLSLGYA 
L MLCPFVVVLGGMF FLAT ALFFVSDRARAEQQVNQ LAMP P AS VK 
V 


6891 


1980 


1262 


lrihqbllskelkllrgitiesiihiglaagkeqfmOdasnvmO " 

LLLKTQSHLYNMEDNNPEVRQAAAYGLGVMAQFGGDDYRSLCSE 
AVPLLVKVI KRAHSKTKKNVIATENCISAIGKILKFKPNCVNVD 
EVLPHWLSWLPLHEDKEEAIQTLSFLCDLIESNHPW1GPNNSN 
LPKI ISIIAEGKINETINYEDPCAKRLANWRQVQTSEDLWLEC 
VSQLDDEQQEALQELLNFA 


6892


3 


876 


RSVAAASGPGAWGTDHYCLELLRKRDYEGYLCSLLLPAESRSSV 
FALRAFNVEI4AQVKDSVSEKTIGLMRMQFWKKTVEDIYCDNPPH 
QPVAIELWKAVKRHNLTKRWLMKIVDEREKNLDDKAYRNIKELE 
N YABNTQS S LLYLTL E 1 LG I KDLHADHAAS H I GXAQG I VTCLRA 
TP YHGSRR KVFLPMD I CMLHG VSQEDFLRRNQDKNVRDV I YD I A 
SQAHLHLKHARSFTIKTVPVKAFPAFLQTVSIjEDFLKKIQRVDFD 
I FHPSLQQKNTLLPLYLYI QSWRKT Y 


6893 


1 


042 


DGBRKSMSVERTFSEINKAEEQYSLCQELCSEIAQDLQKKRL.KG 
RTVT I KLKNVNFEVKT RASTVSS WS TAEEI FA1 AKELLKTE ID 
ADFPHPLRLRLMGVRISSFPNEEDRKHQQRSIIGFLQAGNQALS 
ATECTLEKTDKDKFVKPLEMSHKKSFFDKKRSERKWSHQDTFKC 
EAVNKQSFQTSQPFQVLKKKMNENIjE I SENSDDCQIXjTCPVCFR 
AQGC1 S LEALiNKHVDECLDG PS ISENFKMFSCSHVSATKVNKKE 
NVPAS S LCEKQD YEAH 


6894 


1742 


1463 


TTLCKPLVPREHQFYETLPAEMRKFTPQYKGKSQLLEGLPHWRG 
DVRDRGHGRPWQPSLEPSLPPTLCFPSLSSFSSSWPSAQHLTPS 
VFNPW 


6895 


2379 


478 


\rrYVELCDLASPTALLIMRTVLDLIVEDLQSTSEDKEQQYTSQT" 
TRLLALL YALASHKACKLA I LHL I NGT I KGDERYAE I FQDLLAL 
VRSPGDSVIRQQCVEYVTSILQSLCDQDIALILPSSSEGSISEL 
EQLSNSLPNKELMTS I CDCLLATLANS ES S YNCLLTCVRTMM FL 
AEHD YGLFHLKS S LRKNS S ALHS LLKR WS TFSXDTGELAS S FL 
E FMRQ ILNS DT IGCCGDDNGLME VEGAHTS RTMS INAAELKQLL 
QSKEESPENLFLELEKLVLEHSKDDDNLDSLLDSWGLKQMLES 
SGDPLPLSDQDVEPVLSAPESLQNLFNNRTAYVLADVMDDQIiKS 
MWFTPFQAEEIDTDLDLVKVDLIELSEKCCSDFDLHSEIiERS FL 
SEPSSPGRTKTTKGFKLGKHKHETFITSSGKSEYIEPAKRAHW 
PPPRGRGRGGFGQGIRPHDIFRQRKQNTSRPPSMHVDDFVAAES 
KEWPQDGIPPPKRPLKVSQKISSRGGFSGNRGGRGAFHSQNRF 
FTPPASKGNYSRREGTRGSSWSAQNTPRGNYNESRGGQSNFNRG 
PL PPLR PLS S TG YR PS PR DRAS R GRGG LG P S WAS ANS GSGGSRG 
KFVSGGSGRGRHVRS FTR 


6896 


1 


555 


1 V IUKKKiW KUH 1 1 PLiiW VT IDS i KDhliULKNG WLIKTPTK8 

F A VYAATATEKSE WMNHINK CVTDLLS KSGKTPSNEHAAVWVPD 

S E ATVCMRCQKAKFT P VNRRHHCRKCGFWCG PCS E KRFLLPS Q 

SSKPVRICDFCYDLLSAGDMATCQPARSDSYSQSLKSPLNDMSD 
DDDDDDSSD 


6897 


3 


920 


GDGLMHEVWGI^RPDWETAIQKPLCSLPAGSGNALAASI^IHY " 
AGYEQVTNEDLLTNCTLLLCRRLLSPMNLLS LHTASGLRL FSVL 
SLAWG FI AD VDLESEKYRRLGEMR FT^TFLRLAALRT YRGRLA 
YLPVGRVGSKTPASPVWQQGPVDAHLVPLEEPVPSHWTWPDE 
DFVLVIiALLHSHLGSEMFAAPMGRCAAGVMHLFYVRAGVSRAML 
i^lfu^kgrhmeyecpylvwpwafrlepkdgkgvfAvdgb 

LMVSEAVQGQVHPNYFWMVSGCVEPPPSW2CPQQMPPPEEPL 


6898 


919 


346 


QKTVTAVASLLKGRQG I YTENERRMQAVI K I RFFKiMLVLI ICW~" 
LSN 1 1 NES LL F YLEMQTDI NGGS LKP VRTAAKTTWF IMGI LNPA 
QG FLLS LAFYGWTGCSLG FQS PRKB I QWESLTTSAAEGAHPS PL 

MPHENPASGKVSQVGGQTSDEALSMLSEGSDASTIE1HTASESC 
NKNEGDPALPTHGDL 


6899


i *> r\ 


827 


MKVRKNNDAYLLDKNKINMDCFISCFFKKMLTTJbMFSHSGILSL 
LEHGEEYTFSLPCAYARS I LTVPWVELGGKVSVNCAKTGYSAS I 
TFHTKPFYGGKLHRVTAEVKHNITNTWCRVQGEWNSVLEFTYS 
NGETKYVDLTKLAVTKKRVRPLEKQDPFESRR1,WKNVTDSLRES 
E IDKATEHKHTLE ERQRTE ERHRTETGTPWKTRY FI KEGDGWVY 
HKPLWKIIPTTQPAE 


6900


3


451 


TEVLGSKG IHELRSSTS ALHHALEESASLLTMFWRAALPSTHI P 
VLPGKVGESTERELLELRTKVSQQEQLLQSTTEHLKNANQQKES 
MEQF I V5QLTRTHD VLKKARTNLEVR KLLHQSEAPSLS PTHHHP 
LADLVGDSWPALRFQEK 


6901


1 


201 


DD^VQRLETDFJCMTXiGXX3SrLEQWAAWIjDNVMMQALKPYEGRP 
SFPKAARQFLLKWSFYRYHLGFS 


6902 


2 


267 


GAPPPPPSQPPRQPPQAAPSSHPHSDLTFNPSSALEGQAGAQGA 
SDMPEPSLDLLPELTNPDELLSYLDPPDLPSNSNDDbLSLFENN 


6903


1 


149 


RINQVYRQGPTG I HI LVI DQMVQNFQDESCFLFSTVKAESSDGI 
HULK 


6904 


464 


2092 


MEASLPVSLSCV1ACGDVEGKFDILFNRVQAIQKKSGNFDLLLC 
VGNFFGSTQDAE WEE YKTGI KKAPIQT YVLGANNQETVKYFQDA 
DGCELAENI T YLGRKG IFTGS S GLQI VYLSGTE SLNEP VPG YS F 
SPKDVSSLRMMLCTTSQFKGVDILLTSPWPKCVGNFGNSSGEVD 
TKKCGS ALVSSLATGLKPRYHFAALE KTY YBRLP YRNH I ILQENf 
AQHATR FI ALANVGNPE KK KYL YAFS I VPMKLMDAAELVKQPPD 
VTENP YRKSGQE AS IGKQI LAPVEESACQFFFDLNEKQGRKRS S 
TGRDSKSSPHPKQPRKPPQPPGPCWFCLASPEVEKHLWNIGTH 
CYLAIiAKGGLSDDHVLILPIGHYQSWELSAEWEEVEKYKATL 
RRFFKS RG KWCWFERNYKSHHLQLQ VI P VP I S CSTTDD I KDAF ' 
I TQAQEQQ I ELLE I PEHSD I KQ IAQPGAAYF YVELDTGEKLFHR 
I KKNFPLQFGRE VLAS EAI LNVPDKS D WRQCQ I S KEDEETLARR 
FRKDFEPYDFTLDD 


6905 


1 


226 


VSKTGEAETITSHYLFALGVYRTLYLFNWIWRYHFEGFFDLIAI 
VAGLVQTVLYCDFFYLYITKVLKGKKLSLPA 


6906 


3 


611 


S YDDHNGH I DFI TAASNLRAKM YS I EPADRFKTKR lAGKit I PAI 
ATTTATVSGL VALEM I KVTGG YP FE AYKNWFLNLAI P I WFTE T 
TEVRKTKIRNGISFTIWDRWTVHGKEDFTLLDFINAVKEKYGIE 
PTMWQGVKMLYVPVMPGHAKRLKLTMHKLVKPTTEKKYVDLTV 
SFAPDI DGDEDLPGP PVR YYFSHDTD 


6907 


2 


2228 


LRGVPVWAAGAFRFSSGEESTSHLIM^RRSQRLTRYSQGDDDGS 
SS SGGSS VAGS QSTLFXDS PLRTLKRKS SNMKRLS PAP QLGPS S 
DAHTSYYSESIiVHEfiWFPPTaQQT.FTrT.MrsnTVMursTri^T Dimnnnom 

GGSESSRASGLVGRXATEDFLGSSSGYSSEDDYVGYSDVDQQSS 
S S RLRS AVS RAGS LLWMVATS PGRLFRLL YW WAGTTWYRIjTTAA 
SLLDVFVLTRRFSSLKTFLWFLLPLLLLTCLTYGAWYFYPYGLQ 
TFHPALVSWWAAKDSRRADEGWEARDSSPHFQAEQRVMSRVHSL 
ERRLEALAAEFSSNWQKEAMRLERLELRQGAPGQGGGGGLSHBD 
TXiALLEGL VS RREAALKED FRRE TAAR IQEELSALRAEHQQDS E 
DL FKKI VRASQESEAR I QQLKS EWQSMTQES FQESS VKELRRLE 
DQLAGLQQELAALALKQSSVABEVGLLPQQIQAVRDDVESQFPA 
WISQFLARGGGGRVGLLQREEMQAQLRELESKILTHVAEMQGKS 
AREAAAS Ij S LTLQ KEG V IG VTBEQ VHH I V KQALQR YS EDR I GLA 
D YALESGGAS V I STRCS ET YETKTALLSLFG I PL WYHS Q S P RVI 
LQ PDVHPGNCW AFQG PQG FAWRLS ARI RPTAVT LEHVP KALS P 
NSTISSAPKDFAIFGFDEDLQQEGTLLGKPTYDQDGEPIQTPHF 
QAPTMAT YQ WE LR I LTNWGH PE YTC I YRFRVHG E PAH 


6908


3 


780 


QVPSAAWLMAVCGLGSRLGLGSRLGLQGCFGAARLLYPRFQSRG 
PQGVEDGDRPQPSSECTPRIPKIYTKTGDKGFSSTFTGERRPKDD 
QVFEAVGTTDELSSAIGFALELVTEKGHTFAEELQK1QCTLQDV 
GSALATPCSSAREAHLKYTTFKAGPI LELEQWI DKYTSQLP PLT 
AF I LPSGGKI S S ALHFCRAVCRRAERR WPL VQMGETD ANVAK F 
LNRLS DY LFTLARYAAM KEGNQEK I YKKNDPS AES EGL 


6909


3 


409 


GRLLAVGTDLYGQRSSAPEQELLVQDATPVSNSLLPEKAFSDIP 
SPYLRGTIKMMQAVRQAFQDQDDRRTWDGRPLTE^AATFDDCliYA 

LCWDTT ICRS SfYTClF WON T ATMTFPPFT.SP AY1 .T ^RfiMDDQBMC 

LYC 


6910 


1 


1066 


L UPWUT D<3 Y Y VC2 V fTV Ya P X .M TI/T.VW T WTPW2 D T\ T . WiTI? DM VCV 
xj v a v v v iua x x (gAuv x/irulNx Viiinif X JfcTlvjjf ULiXVji A Jlrn Zr X 

L INGFLNFNVAFALALLVLPLTS LMBYLLQRFHVQNLGHPYWLT 
LAPMYIWFI I FFIQPHKEERFLFPVYPLICLCGAVALSALQHSF 
LYFQKCYHFVFQRYRLEHYTVTSNWI^ALGTVFLFGtiLSFSRSVA 
LFRGYHGPLDLYPEFYRIATDPTIHTVPEGRPVNVCVGKEWYRF 
PS S FLLPDN WQLQF I PS E FRGQLPKP FAEGPLATRI VPTDMNDQ 
NLE E PSR YI D I S KCH YL VDLDTMRETPRE PKYS SN KEEW I S LAY 
RPFIiDASRSSKljI.RAFYVPF'L<SnfiVTVYVNYT T T KTiT D V 

KSGG 


6911 


1184 


966 


GEnAEEMETGNVANLISIFGSSFSGLLRKSPGGGREEEEGEESG 
PEAAEPGQICCDKPVLRDMNPWSTAIVAF 


6912 


1 


844 


AMKPVETHSFQMLFTILSTGSALKAQSYEDAYRCIKSSILLGSI 
SGGTDI ISCFMGHNFSL P VYKGE IQARNLGt4AVEAWNEEGKAVW 

YCR I NPKTGG I VMLGRSDGTLNPNGVRFGS SE I YN I VES FE EVE 
DSLCVPQYNKYREERVILFLKMASGHAFQPDLVKRIRDAIRMGL 
SARHVPSLI LETKGI PYTLNGKKVEVAVKQI IAGKAVEQGGAFS 
NPETLDLYRJD I PELQGF 


6913 


1643 


1558


KKSHEESHKEELSYGAQASLPLPCSDFR 


6914 


1251 


615 


ELAAECKSAGYPGTLIPYRCDLSNEEDILSMFSAIRSQHSGVDI 

CINI^GIxARPDTLLSGSTSGWIODMFNWVL^ 

ERNVDDGHI INI NSMSGHRVLPLS VTHFYSATKYAVTALTEGLR 

QELREAQTHIRATCISPGWETQFAFKLHDKDPEKAAATYBQMK 

CLKPEDVAEAV I YVLST P AH I Q I GD I QMRPTEQVT 


6915


254 


652


GRSLS FKTFLI WVLI S I YQGG ILMYGALVLFESEFVHWAI S FT 

AIjILTELI^VALTVRTWHWLMWAEFLSLGCYV^ 

VAF I TTVTFLW KVS AI TWS C LPL YVLKYLRRKLS P PS YCKLAS 


6916 


254 


652 


GRSLS FKx'FLI WVLISI YQGGILMYGALVLFESEFVHWAI S FT 
AL I LTE LLMVALT VRTWHWLMVVAE FLSLGCYVS S LAFLNE Y FD 
VAFITTVTFLWKVSAITWSCLPLYVLKYLRRKLS P PS YCKLAS 


6917 


254 


652 


GRSLSFKTFLIWVLISIYQGGILMYGALVLFESEFVHVVAISFT 
ALILTELLMVALTWTWHWLMWAEFLSLGCYVSSLAFLNEYFD 
VAFITTVTFLWKVSAI TVVS CLPLYVLKYLRRKLS P PS YCKLAS 


6918 


28 


921 


PEAGTRS WRE PD P EDLRRFLLS AACRS FPQWL PGGGGGQ VS S CS 
DTD VP YLLLAVKS E PGRFAERQAVRE TWGS PA PG I R LLFLLGS P 
VGEAGPDLDSLVAWESRRYSDLLLWDFLDVPFNQTLKDLLLLAW 
LGRHCPTVSFVLPJ^QDIX^FVHTPAlxIAHLRALPPASARSLYLGE 
VFTQAMPLRKPGGPFYVPESFFEGGYPAYASGGGYVIAGRLAPW 
LLRAAARVAPFPFEDVYTGLCIRALGLVPQAHPGFLTAWPADRT 
ADHCAFRNLLLVRPLGPQAS IPXWKQLQDPRLQC 






WO 01/53312 



PCT/US00/34263





6919 


850 


41 


QGRR ELSGS VFCPF I QQEPKJ£MLT1»SE YHERVRSQGQQLQQLQA "* 
ELDXLHKEVSTVRAANSERVAICUVPQRLNEDFVRKPDYALSSVG 
AS IDLQKTSHDyADRNTAYFWNRFSFWNYARPPTVILEPHVFPG 
NCWAFEGDQGQWIQLPGRVQLSDITLQHPPPSVEHTGGANSAP 
RD FAVF FLLS FFTHQGLQVYD ETE VSLGKFT FD VEKSE I QTFHL 
QNDP PAAFP KVK I Q I LS N WGH PR FTCL YR VRAHG VRTS EGAEGS 
AQGPH 


6920 


1418 


591 


EAQGPSKVHLTLKKKK 


6921 


2 


1711 


MNATRS EEQ FH VI NHAEQTLRKMENYLKJ2 KQLCDVLL IAGHLR I 
PAHRLVLS AVS DY FAAM FTNDVLEAKQE EVRMEG VDPNALNS L V 
QYAYTGVLQIjKEDTIESLIiAAACLLQLTQVIDVCSNFIjI kqlhp 
SNCLGIRSFGDAQGCTELL2^VAHKYTMEHFIEVIKNQEFLLLPA 
NEISKLLCSDDINVPDEETIFHALMQWVGHDVQNRQGELGMLLS 
YIRLPLLPPQLLADLETSSMFTGDLECQKLLMEAMKYHLLPERR 
SMMQS PRT KP RXST VGAL» YAVGGMDAMKGTTT I E KYDLRTNSWL 
H IGTMNGRRLQFG VAVI DNiCLYWGGRDGUCTLNTVE CFNP VG K 
IWTVMPPMSTHRHGLGVATLEGPMYAVGGHDGWSYLNTVERWDP 
EGRQWMYVASMSTPRSTVGWALNNKLYAIGGRDGSSCLKSMEY 
FDP HTNKWS LCAPMS KRRGG VG VATYNG FLYVVGOHDAPAS NHC 
SRLSDCVERYDPKGD3WSTVAPLSVPRDAVAVCPK3DKLYWGG 
YDGHT YLNTVES YDAQRNE WKBEVP VNIGRAGACWWKLP 


6922 


1075 


369 


LTPPAGIRHEVRDRBREREREREREKFPLDSTGSELKO^lriSiT " 
GLPPAMQKVMYKGIAPEDKTLREIKVTSGAKIMGGGSTINDVLA 
VNT P KDAAQQDAKAE ENKKE PLCROKOHRKVT.DKG KP ktwmdqu 

KGAQERLPTVPLSGMYNKSGGKVRLTFKLEQDQLWIGTKERTEK 
LPMGSI KNWS E PIEGHED YHMMAFQLGPTEAS YYWVYWVPTQY 
VDAI KDTVLGKWQYF 


6923


2469 


1660 


IiGL FCI LP I DTLCAVLERDTLS IRE S RLFGA WRWAEAE CQRQQ " 

LPVTFGNKQKVLGKALSLIRFPLMTIEEFAAGPAQSGILSDREV 

VNLFLHFTVNPKPRVEYIDRPRCCLRGKECCINRFQQVESRWGY 

SGTSDRIRFTVNRRISIVGFGLYGSIHGPTDYQVNIQIIEYEKK 

QTLGQNDTGFS CDGTANTFRVM FKE P IE I L PN VC YTACATLKG P 

DSHYGTKGLKKWHET PAAS KTVFFF FSSPGNNNGTS I EDGQ IP 

EIIFYT 


6924 


2210 


1235 


PEERVICFVEYYLTAFHEGRKGAIAKKP YNPI IGETFHCSWEVP 
KDRVKPKRTASRSPASCHEHPMADDPSKSYKI4RFVAEQVSHHPP 
I S CF YCE CEEKRLC VNTH VWTKS KFMGMS VGVSM IGEG VLRLLE 
HGEEYVFTLPS AYARS I LT I PWVELGGKVS INCAKTG YSAT VI F 
HTKPFYGGKVHR VTAE VKilNPTNTI VCKAHGE WNGTLE FT YNNG 
ETKVIDTTTLPVYPKKIRPLEKQGPMESRNLWREVTRYIiRLGDI 
DAATEQKRHLEEKQRVEERKRENLRTPWKPKYFIQEGDGSG I LQ 
SPLESTLMGLEVQSFPV 


6925 
6926 


2 
1 


1653 
733 


RGGAAGAAMBPDSVIEDKTIELMCSVPRSLWLGCANLVESMCAL " 
S CLQ SMPS VRCLQ I SNGTS S VI VSRKRPS EGN YQKE KDLC I KYF 
DQWS E S DQVEFVEHLI SRMCH YQHGH I NS YLKPMLQRDFITALP 
EQGLDHIAENILSYLDARSLCAAELVCKEWQRVISEGMLWKKLI 
ERMVRTDPLWKGLSERRGWDQYLFKNRPTDGPPNSFYRSLYPKI 
IQDI ETI ESNWR CGRHNLQR IQCRSENS KGVY CLQ YDDE Kl I SG 
LRDNS IKIWDKTS tiECLKVLTGHTGSVLCLQYDERVI VTGSSDS 
TVRVWDVNTGEVLNTLIHHNEAVI-HLRFSNGLMVTCSKDRSIAV 
WDMASATDITLRRVLVGHRAAVNWDFDDICYIVSASGDRTI KVW 
STSTCEFVRTLNGHKRGIACLQYRDRIiWSGSSDNTlRLWDIEC 
GACLRVLEGHEELVRCIRFDNKRIVSGAYDGKIKVWDLQAALDP 
RAPAS TliCLRTLVEHSGR VFRLQFDE FQ 1 1 S S SHDDTI L I WDFL 
NVPPSAQNETRS PSRTYTY I SR 

SGRVAMDGLGLQFPEQGFPAGPPLLPPHMGGHYRDCQSLGAPPL 
DGYPLPTPDTSPLDGVDPDPAFFAAFMPGDCPAAGTYSYAQVSD 
YAGPPEPPAGPMHPRLGPEPAGPSIPGLLAPPSALHVYYGAMGS 
PGAGGGRGFQMQPQHQHQHQHQHHPPGPGQPTPPPEALPCRDGT 
D PSQPAELLG E VDRTE FEQ YLHF VCKPEMGL P YQGHDSG VNLF D 
SHGA I SSWS DASSAVY YCNYPDV 




2 


1484 


bTLCGDIQLMliAQNANNRAAHLEBFHYQTKEDQEILHSLHRESS 
CQGFAWATDL5TDLESQLSVSCKCYEAANBILQFRDLKSQNPEH 
YVQVIiKRMGNIRNEIGVFYMNQAAALQSERLVSKSVSAAEQQLW 
KKS FS C FE KG I HNFES I EDATNAALL LCNTGRLMRI CAQAHCGA 
GDELKREFSPEEGLYYNKAIDYYLKALRSLGTRDIHPAVWDSVN 
WBLSTTYFTMATLQQDYAPLSRKAQEQIEXBVSEAMMKSLKYCD 
VDSVSARQPLCQYRAATIHHRLASMYHSCLRNQVGDEHLRKQHR 
VLADLH YS KAAKLFQLLKDAPCELLRVQLER VAFAEFQMTS QNS 
NVGKLKTLSGALDIMVRTEHAFQLIQKELIEEFGQPKSGDAAAA 
ADAS PSLNREEVMKLLS I FESRLS FLLLQS IKLLSSTKKKTSNN 

IEDDTILKTNKHIYSQLLRATANKTATLLERINVIVHLLGQLAA 
GSAASSNAVQ 


6928 
6929


1086 


777 


EAIDL INNI»I»y VKMRKRYS VDKTLSHP WLQD YQTWLDLRELECK 
1GER YI THE S DDLRWEKYAGEQG LQ Y PTHL IN PS A3HSDT P ETE 
ETEMKALGERVSIL 


- 


1749 


607 


RDQRGYRDDRS PAREPGDVS ARTRSGGGGGRSATTAMPP P VPNG 
NLHQHDPQDLRHNGNVWAGRPSCSRGPRRAIOKPQPAGGRRSG 
RGPAAGGLCLQPPDGGTCVPEEPPVPPMDWEALEKHLAGLQFRE 
QEVRNOGQARTNSTSAQKNERESIRQKLALGS FFDDGPG I YTS C 
SKSGKPSLSSRLQSGMNLQICFVNDSGSDKDSDADDSKTETSLD 
TPLS PMS KQSS S YSDRDTTE EESESLDDMDFLTRQKKLQAEAKM 
ALAMAKPMAKMQVEVE KQNRKKS P VADLLpHM PH I S ECLM KRS L 
KPTDLRDMTIGQLQVIVNDLHSQIESLNEELVQLLLIRDELHTE 
QDAMLVDIEDLTRHAESQQKHMAEKMPAK 


6930 
6931


131 


545 


FKDTANVFVSIjFQMRNNFRHYFIBPSQLKLFYDVITWJVtQVAI 
S YTWP FVLLS I KPSLTFYSS WYYCLHILGII*VLH*LPVKKTQR 
RK^HENIQLSQSKKFDEGENSWQNSFSTTNNVCNQNQEIASR 

hsslkq 




2 


659 


FVERLPNRPACLLVASGAAEGVSAQSFLHCFTMASTAFNLQVAT 
PGG KAME FVDVTE S NAR WVQD FRLKA YAS PAK LE S I DGAR YHAL 
LI PS CPGALTD LAS SGSLAR I LQHFHSES KP ICAVGHGVAALCC 
ATNEDRSWVFDSYSLTGPSVCELVRAPGFARLPLWEDFVKDSG 
ACFS ASE PDAVHWLDRHL VTGQNAS S TVPAVQNLL FLCGS RK 


6932 
6933 


2 


1131 


FVDSPGQGEQAEEEEGGIQMNSRMRAHS PAEGAS VESSS PGPKK" ~ 
SDMCEGCRSLAAGHPGYISHDKETSIKYVSHQHPSHPQLFSIVR 
QACVRS L SCE VCPGREGP I FFGDEQHGFVFSHT FF I KDS LARG F 
QR WYS 1 1 TIMMDRI YL I NS WPFLLGKVRG I IDELQG KALKVFEA 
E QFGC PQRAQ RMNTAFTPFLHQRNGNAARS LTSLTSDDNLWACL 
HTSFAWLLKACGSRLTEKLLEGAPTEDTLVQMEKLADLEEESES 
WDNSEAEEEEKAPVLPESTBGRELTQGPAESSSLSGCGSWQPRK 
LP VFKSLRHMRQ VGGRGTAHH ELRRRANHGLCLPTRLAS GPS TL 
KTLQEVTDS LLGGWLMAQGVGGI I 




1431 


890 


SLNL^CTLPPFPHQYPAGYPSDKEGKKPKGQSKKQPSGTTKRPI 
SDDDCPSASKVYKASDSASAIEAFQLTPQQQHLIREDCQNQKLW 
DE VLSHL VEG PNFLKKLEQ S FM C VCCQELVYQ P VTTECFHNVCK 
DCLQRS FKAQV FS C P ACRHDLGQN YI M I PNE I LQTLLDLFFPGY 
SKGR 


6934 


3030 


££88 


DRDHSQCGGlRRVAIiARVSSVKLISKAKIRTVKMTPl IVLAFIV 
CWTPFFFVQMWSVWDANAPKBASAFIIVMLIiASLNSCCNPWIYM 
LFTGHLFHELVQRFLCCSAS YLKGRRLGETSASKKS NS S S FVLS 
HRSSSQRSCSQPSXA 
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SEQ 
ID 
NO : 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 
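
The notation in the legend above can be read mechanically. The following is a minimal sketch, not part of the original disclosure, of how an annotated amino acid segment from this table might be split into its residues and its marker positions. The function name `parse_segment`, the `Annotated` record, and the choice to report each marker by the number of residues that precede it are all assumptions made for illustration. Characters that are not in the legend (for example, artifacts of text extraction) are collected separately rather than guessed at.

```python
from dataclasses import dataclass

# One-letter codes exactly as listed in the table legend.
AMINO_ACIDS = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown",
}

# Marker symbols from the legend: stop codon, possible deletion, possible insertion.
MARKERS = ("*", "/", "\\")


@dataclass
class Annotated:
    """Hypothetical container for one parsed table entry."""
    residues: str        # amino acid letters only, in order
    stops: list          # residue counts preceding each '*'
    deletions: list      # residue counts preceding each '/'
    insertions: list     # residue counts preceding each '\'
    unrecognized: list   # characters not covered by the legend


def parse_segment(text):
    """Split an annotated segment into residues and marker positions."""
    residues = []
    events = {marker: [] for marker in MARKERS}
    unrecognized = []
    for ch in text:
        if ch.isspace():
            continue  # whitespace carries no meaning in the table
        if ch in AMINO_ACIDS:
            residues.append(ch)
        elif ch in events:
            # Record how many residues came before this marker.
            events[ch].append(len(residues))
        else:
            unrecognized.append(ch)
    return Annotated("".join(residues), events["*"], events["/"],
                     events["\\"], unrecognized)
```

Under these assumptions, a caller could pass the text of one table cell to `parse_segment` and inspect `unrecognized` to decide whether the entry needs manual checking against the original publication.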


6935 


886 


543 


NSALYVAGGNDGTS CLNS VERYS PKAGAWES VAPMNIRRSTHDL 
VAMDGWLYAVGGNDGSSSLNS IEKYNPRTNKWVAASCMFTRRSS 
VGVAVLELLNPP PPS S PTLS VSSTSL 


6936 


1347 


567 


RSHRRQFLSRALLEFFGKSHPPPHRLFRKSLNVGLHYSHIPFLT 
TCLHFLRKRLQKGEVGLSVETSKPQVPVGGLSRKKVPQEPWATV 
MEKRLQEAQLYKEEGNQRYREGKYRDAVSRYHRAiLQLRGLDPS 
LPSPLPNLGPQGPALTPEQENILHTTQTDCYNNLAACLIjQMEPV 
NYERVREYSQKVLERQPDNAKALYRAGVAFFHLQDYDQARHYLL 
AAVNRQPKDANVRRYLQLTQSELSSYHRKEKQLYLGMFG 


6937 


1 


727 


AVEFRCCPGRDPACFARGWRLDRVYGTCFCDQACRFTGDCCFDtf 
DRACPARPCFVGEWSPWSGCADQCKPTTRVRRRSVQQEPQNGGA 
PCPPLEERAGCLEYSTPQGQDCGHTYVPAFITTSAFNKERTRQA 
TSPHWSTHTEDAGYCMEFKTESLTPHCALENRPLTRWMQYLREG 
YTVC VDCQP PAMNS VS LRCS GDGLDSDGNQTLHWQAI GNPRCQG 
TWKKVRRVDQCSCPAVHSFIFI 


6938 


3 


719 


NSRKLEIiAERVDTDFMQLKKRRQSSEKENDSGTLDTVGAVWDH 
EGNVAAAVS SGGLALKHPGR VGQAALYGCGCWAENTGAHNPYST 
AVSTSGCGEHLVRTILARECSHALQAEDAHQALLETMQNKFISS 
P FtASEDG VLGGVI VLRS CR CS AE P DS SQNKQTLLVE FLWSHTT 
ESMCVGYMSAQDGKAKTHISRLPPGAVAGQSVAIEGGVCRLGEP 
SELTLQAECEASQRHFRT 


6939 


3 


810 


KVTAPRRPQRYSSGHGSDNSSVLSGELPPAMGRTALFHHSGGSS 
GYESIiRRDSEATGSASSAPDSMSESGAAS PGARTRSLKS PKKRA 
TGLQRRRLIPAPLPDTTALGRKPSLPGQWVDLPPPLAGSLKEPF 
E I KVYEI DDVERLQR PRPTPREAPTQGLACVSTRLRLAERRQQR 
LRE VQAKHKHLCEELAETQGRLMLE PGRWLEQFEVDPELEPES A 
EYLAALERATAALEQCVNLQCAHVMMVTCFDISVAASAAI PGPQ 
EVDV 


6940 


1188 


496 


GKMAAQPLRHRSRCATPPRGDFCGGTERAIDQASFTTSMEWDTQ 
WKGSS PLGPAGLGAEE PAAGPQLP S WLQPERCAVFQCAQ CHAV 
LADS VHLAWDLSRSIiGAVVFSR VTNNVVLE AP FLVG I EGS LKGS 
TYNLLFCGSCGrPVGFHLYSTHAALAALRGHFCLSSDKMVCYLL 
KTKAIVNASEMDIQNVPLSEKIAELXEKIVLTHNRLKSLMKILS 
EVTPDQSKPEN 


6941 


1 


713 


SLSRADSDPHGPHTCGHVLNVI I GS NV1ALAE AQRQAE ALG YQA 
WLSAAKQGDVKSMAQ FYGLLAHVARTRLTPS MAGAS VEEDAQL 
HELAAELQ I PDLQLE EALETMAWGRGP VCLLAGG EPTVQLQGSG 
RGGRNQELALRVGAELRRWPLGPIDVLFLSGGTDGODGPTEAAG 
AWVTPELASQAAAEGLDIATFLAHNDSHTFFCCLQGGAHLLHTG 
MTGTNVMDTHLLFLR PR 


6942 


1 


246 


GDYVERYDPKTDTWTMGAPLSMPTNAVGGCLLGDRLYADGGYDG 
QTYLNTMESYDPQTNEWTQMASLNIGRAOACVVVIKQP 


6943 


1 


739 


PMATGDGAKTLAIHVKALTADS I RITWKATLPAS S FRLSWLRLG 
HS PAGGSITETLVQGDKTBYLLTALEPKPTYI I CMVTMETTNA Y 

V/UJ£illfVUiK/uSJL ALlo xtar 1 1 lJbwUtQNAGPMASLPliAGIIGGA 

VALVFLFLVLGAI CWYVHQAGELLTRERAYNRGS RKKDD YMESG 
TKKDNS ILEIRGPGLQMLPINPYRAKEEYWHTI FPSKGSSLCK 
ATHT I G YGTTRG YRDGG I PDIDYSYT 


6944 


960 


156 


VAN I LLNGVK YE S ELTGS S E RAEQ PLS VGRLCS T I CNM P KALRT 
LCVNHFLGWLSFEGMLLFYTDFMGEWFQGDPKAPHTSEAYOKY 
NSGVTMGCWGMCI YAFSAAFYSA IJjEKLEEFLSVRTIjYFIAYLA 
FGLGTGIiATLSRNLYVVLSLCITYGILFSTLCTLPYSLLCDYYQ 
S KKFAGSSADGTRRGMGVD I S LLSCQ YFLAQ I LVS LVLGPLTSA 
VGSANGVMYFSSLVSFLGCIiYSSLFVIYElPPSDAADEEHRPLL 
LNV 






WO 01/53312 



PCT/US00/34263 



SEQ 
ID 

NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 


6945 


2067 


179 


EGEDRGLPRTMGAALGTGTRLAP W PGRACGAL P RWT PTAPAQGC 
HSKPGPARPVPLKKRGYDVTRNPHIiNKGMAFTLBBRLQLGIHGL 
I P P CFLS QDVQLLR I MRYYERQQS DLDKYI 1 LMTLQDRNE KLF Y 
RVLTS DV E KFMP I VYT PTVGLACQH YGLTFRR PRG LF I TIHDKG 
HLATMLNSWPEDNI KAWVTDGERI LGLGDLGC YGMGI PVGKLA 
LYTACGGVNPQQCLPVLLDVGTNNEBLLRDPLYIGLKHQRVHGK 
A YDDL LDE FMOAVTDKFG I NCLIQFED FANANAFRLLNK YRNKY 
CM FNDD I QGTAS VAVAG I LAALR I TKN KLSNHVFGFQGAGE AAM 
G \ I AHLL VMALE \ KEGVPKA\ EATRKI W\MVDF \ KGL I VQ GRDH 
LNHEKEMFAQD\HPEVNSLEEWRIiVKPTAIIGVAAIAEA\FTE 
QILRDMASFHERP\IIFALSNPTSKAECTA\EKCYRVTEGPRGF 
FAS\GSPF+GVLIWEMGKTFIPGGRGNNA*RVPRGWQLGVHSPG 
GDPGHIP\DEIFLPDSRAKLPQEVSEQHLSQGRLYP\PLST\IR 
NVFLRIAIKVFD*GYKHNl»V\SYYPEPKD\KEAFCKIPGSYTPD 
YDS FYT/VDS YI WAQGKAMKVQTV 


6946 


133 


2551 


5 CE YSGIT VA PGDPCPG VAHLLAPSMASDTPES LMALCTDFCLR 
NLDGTLGYLLDKETLRLHPDIFLPSEI\CDRLVNEYVELVNAAC 
NF\EPHE\SFFNPLFRDPRKQPASRRIHL\RED\liVQD\QD\LE 
AIRKQDL\VEL\YLTN\CEKLSAKSLOTI^SFSHTLGVP*AFFG 
C\TNILLLRKENPGGL/CEDEYLFNPTCQVLVKDFTFEGFSRLR 
F \ LKLGRM IDW VP VES \LLRPLNSLAALDLSG IQTSDAA\ FLTQ 
WKDSL\VSr*VL\YNMDLSDDIIIR\VIVQLHKIjRHLDISRDRLSS 
YYKFKLTREVLSLFVQKLGNLMSLDIS9\HMILEMCSISKIGKR 
EAGQTSI\EPSK\SSIIPFRGFEGGPLQF\LGVF*GIFCGRLTH 
IPAYKVSGDKNEEQVLNAIEAYTEHRPEITSRAINLLFDIARIE 
RCNQLLRALKLVITALKCHKYDRNIQVTGSAALFYLTNSEYRSE 
QSVKLRRQVIQWLNGMESYQEVTVORNCCLTLCWFSIPEELEF 
QYRRVNELLLSILNPTRQDBSIQRIAVHLCNALVCQVDNDHKEA 
VGKMGFVVTMLKLTQKKLIiDKTCIDQVMEFSW\SALWNITDETPD 
NCEMFLNFNGMKLFLDCLNEFPEKQELHRNMLGLLGNVAEVKEL 
RPQLMTSQFISVFSNLLESKADGIEVSYNACGVLSHIMFDGPEA 
WGVCEPQREEVEERiMWAAIQSWDINSRRNlNYRSFEPILRLLPQ 
GISPVSQHWATWALYNIiVSVYPDKYCPliLIKEGGMPLLRDIIKM 
ATARQETKEMARKVIEHCSNFKEENMDTSR 


6947 


2 


1*82 


TSVSTI PRGLASARPQSRS WRCCPVWRRSPGRARGRGLKMLNVP 
SQSFPAPRSQQRVASGGRSKVPLKQGRSLMDWIRLTKSGKDLTG 
LKGRLIEVTEEELKKHNKKDDCWICIRGFVYNVSPYMEYHPGGE 
DELMRAAGSDGTELFDQVHRW VN YESMLKECLVGRMAI KPAVLK 
DYREEEKKVLNGMLPKSQVTDTLiAKEGPSYPSYDWFQTDSLVTI 
/EHIY*TEGYQFRIjNNS*SSE*FLYSRNNY*GLLISYTYW/R*A 
MRFRKIFLCGL/CESVGKIEIVLQKKENTSWDFLGHPLKNHNSL 
IPRKDTGLYYRKCQLISKEDVTHDTRLFCLMLPPSTHLQVPIGQ 
HVYLKLPITGTEIVKPYTPVSGSLLSEFKEPVLPNNKYIYFLIK 

iyptglftpeldrlqigdfvsvsspegnfkiskfqeledlflla 

AGTGFTPMVKILNYALTDIPSLRKVKLMFFNKTEDDIIWRSQLE 
KLAFKDKRLDVEFVIiSAPISEWNGKQGHlSPALLSEFLKRNLDK 

SKVLVCI CGPVPFTROGVR T.T.HDT .M P<5 V7J I? tttc B»r * 


6948 


104 


58 


PDGAKSFFPDEYFTCSSLCLSCGVGCkKSMNHGKEGVPHEAKSR 
CRYSHQYDNRVYTCKACYERGEEVSWPKTSASTDSPWMGLAKY 
AWSGYVIECPNCGVVYRSRQYWFGNQDPVDTVVRTEIVHVWPGT 
DGFLKDNNNAAQRLLDGMNFMAQSVSELSLGPTKAVTSWLTDQI 
APAY W RPNS Q I LS CNKCATS FKDNDTKHHCRACGEG FCDS CS SK 
TRP VPERGWGPAPVRVCDNCYEAR/TRPVSCYRGTSGR * RRRRT 
QETVB 


6949 


152 


465* 


GLRLCLSRPLTRPGDDSVGGSAMASGAGGVGG&3GGKI RTRRCH " 
QGPIKPYQQGRQQHQGILSRVTESVKNIVPGWLQRYFNKNEDVC 
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ID 
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to first 
amino acid 
residue of 
amino acid 
sequence 



6950 



2585 



6951 



T940- 



Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



411 



~53T 



Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 
SCSTDTSEVPRWPENKEDHLVYADEESSNITDGRITPEPAVSNT 
EEP$TTSTAST\YPDVLTRVSLYRSHLNFSMLESPALHCQPSTS 
S AFP I GSS G FS LVKE I KDSTSQHDDDNISTTSGFSSRAS DKD I T 
VS KNTSLP P L WS PEAE RS HS hS QHTATSSKKPAFNLS AFGTLS P 
SLGNSSILKTSQLGDSPFYPGKTTYGGAAAAVRQSKLRNTPyQA 
PVRRQMKAKQLSAQS YG VTSSTARRILQSLEKMSS PLADAKRI P 
S I VS S PLNS PLDRSG I D I TDFQAKRE K VDSQ YP P VQRLMT PKPV 
SIATNRSVYFKPSLTPSGEFRKTNQRIDKKCSTGYEKNMTPGQN 
REQRESGFSYPNFSLPAANGLSSGVGGGGGKMRRERHAFVASKP 
LEEEEMEGPVLPKISLPITSSSLPTFNFSSPEITTSSPSPINSS 
OALTNKVQMTSPSSTGSPMFKPSS PI VKSTEANVLPPSS IGFTF 
SVPVAKTAELSGSSSTLEPIISSSAHHVTTVNSTNCKKTPPEDC 
EGPFRPAEILKEGSVLDILKSPGFASPKIDSVAAQPTATSPWY 
TR PA IS S FS SSG IGFGES LKAGSS WQ CDTCL LQNKVTDNKC I AC 
QAAKLSPRDTAKQTGIETPNKSGKTTLSASGTGFGDKFKPVIGT 
WDCDTCLVQNKP EAI KCVACETPKPGTCVKRALTLTvVs ESAET 
MTASSSSCTVTTGTLGFGDKFKRPIGSWECSVCCVSNNAEDNKC 
VSCMSEKPGSSVPTSSSSTVPVSLPSGGSLGLEKFKKPEGIWDC 
ELCLVQNKADSTKCLACESAKPGTKSGFKGFDTSSSSSNSAASS 
S FKFGVS S S SSG PSQTLTS TGNFK FGDQGGFKI G VS SDSG Y I NP 
MSEGF*FSKHIVGFKFGVSSESKPBEVKKOSKNDNFKPGLSFGL 
SNPVFLT PFQFG VSNI/3QE EKKEELIjKS S CAGFR FGTG VINSTR 
VPANTIVTSENKSSFNLGTIETKSVSVAPLKCQTSEAKKEEMPA 
TKGGFSFGNVBPASLPSASVFVLGRTEEKQQEPVTSTSLVFGEG 
KLTMKEPKC\QPVFSFGEFQRQTKDENSSKSTFSFSMTKPSEKE 
SEQPAKATFAFGAQTNTTADQGAAKPDLSYLNNSSSSSSTPATS 
AGGG\IFGSSTSSSNPPVATFVFGQSSNPGSSS\AFGNTAESST 
SQSLLFSQDSKLATTSSTGTAVTPFVFGPGASSNNTTTSGFGFG 
ATTTSSSAGSSFVFGTGPSAPSASPAFGAMQTPTFGQSQGASQP 
NPPGFGSISSSTALFPTGSQPAPPTFGTVSSSSQPPVFGQQPSQ 
SAFGSGTTPNSSSAFQFGSSTTNFNFTNWSPSGVFTFGANSSTP 

AASAQPSGSGGFPFNQSPAAFTVGSNGKNVFSSSGTSFSGRKIK 
TAVRRRK 

PKPG S RS QLCK KAGE RGAV &ftGGL SRRTRAM » IMDELHYQDTDS 
DVPEQRDSKCKVKWTHE EDEQLRAIiVRQ FGQQDW KFLASHFPNR 
TDQQCQYRWLRVLNPDLVKGPWTKEEDQKVIEIiVKKYGTKQWTL 
IAKHLKGRU3KQCRERWHNHLNPEVKKSCWTEEEDRI ICEAHKV 
LGNRWAE I AKML PGRTDNAVKNHWNST I KRKVDTGG FLS ES KDC 
KPPVYLLLELEDKDGLQSAQPTEGQGSLLTNWPSVPPTIKEEEN 
SEEELAAATTSKEQEPIGTDLDAVRTPEPLEEFPKREDQEGSPP 
ETSLPYKWVVEAANLLIPAVGSSLSEALDLIESDPDAWCDLSKF 
DLPEEPSAEDSINNSLVQLQASHQQQVLPPRQPSA\LVPSVTEY 
RLDGHTISDLSRSSRGEL I P I S PSTE VGGSGIGTPPSVLKRQR K 
RR VALS PV TENSTS LS FLDS CNS X»TP KS TP VKTLPFSPS Q FLNF 
VJNKQDTLELESPSLTSTPVCSQK^A^TTPLHRDKTPLHQKHAAF 
VTPDQKYSMDNTPHTPTPFKNALEKYGPLKPIiPQTPHLEEDLKE 
VLRSEAGIELIIEDDIRPEKQKRKPGLRRSPIKKVRKSIALDIV 
DEDMKLMMSTLPKSLSLPTTAPSNSSSLTLSGIKEDNSLLNQGF 

LQAKPEKAAVAQKPRSHFTTPAPMSSAWKTVACGGTRDQLFMQE 
KARQLLGRLKPS HTSR TLI LS 

AG PDDTMKRS bQAL Y CQLLS FLL I IiALTEALA FA IQE PS PRESL 
QVLPSGTPPGTMVTAPHSSTRHTSWMLTPNPDGPPSQAAAPMA 
TPTPRAEGHPPT\TPSPPSLRQ*PPPILKAP/SSTGPAPAAMAT 
TSSKPEGRPRGQAAPTILLTKPPGATSRPTTAPPRTTTRRPPRP 
PGSSRKGAGNSSRPVPPAPGGHSRSKEGQRGRNPSSTPLGQKRP 
LGKIFQIYKGNFTGSVEPEPSTLTPRTPLWGYSSSPQPQTVAAT 
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ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 


6952 






TVPSNTSWAPTTTSLGPAKDKPGLRRAAQGGGSTFTSQGGTPDA 
TAASGAPVSP/PSCPSAFSAPPPR*PTGWPQP**LLAYCYP\CT 
S RPLSTS SG VFT AATGPTPAAFDTS VSA PSQGI PQGAS TTPQAP 

THPSRVSESTISGAKEETVA\PSP»PTGCPVLSPQWYPQPQAIS 
STAWSPPGPOSLGQQGTSPMWPR6TNRSTBPPSA*ARWISPO*S 
WP S ACP S P P \ LCPADG VLHEEEE EDRQ PGBQ PEA YGNNTHHPGT 

TFQOAC\RGAAPGEIPVPLKPLRTQLSBPRSPANGDYRDTGMVP 
C 


6953 


6£8 


304 


PES EGE SGE MTDRYT I HS Q LEHLQS KV I GT \ ATP'I'PP 5GSG \ CE 
PTPRLVLLLHGPLRPSQLLRHCGE * EQSASPLQLDGKDASALWT 
_ ASRQARGELRLCLTTAVRGTSPS VS PVCQSS 


6954 


1512 


349 


MWGKTRALASGKHVPFGKtjTNPNKSyVHCDS»G»*RRETTQDBS 
PS PHFRGKMGG W \ KI/E KELENTBQP VGGNEG * EHE VTGNLNS D 
PLLELCQCPLCQLDCGSREQLIAHVYQHTAAWSAKSYM\CPVC 
GRALSS PGS LGRHLL I HS E DQRSNCAVCGAR FTS HATFNSEKLP 

EVLNMESLPTVHNEGPSSAEGKDIAFSPPVYPAGILLVCNNCAA 
YRKLLEAQTPSVRKWALRRQNEPLEVRLQRLERERTAKKSRRDN 
ETPEEREVRRMRDREAKRLQRMQETDEQRARRLQRDREAMRLKR 
AIETPEKRQARLIREREAKRLKRRLEKMDMMLRAQFGQDPSAMA 
ALAAEMNFFQLPVSGVELDSQLLGKMAFEEQNSSSLH 


6955 


*19 

" 196T 


1 


PPPPFHPShfKEAGT*AG*KRSGDSECSPPVEU*A*TRAAAQN 

* PQR * RWTEGNS PQAS AVATPGQGAS PAAPRCT P * PSRRH RRLP 

PGARPPAG*AAPAPTKPWLAGPASAPQPGAAPLSPPAPPLIRTR 

* CAGAAARGR PRRDRS PRPRTPGGCS WSEPRTPPAVSASAQTPS 

DAG*AGGR*GQRQRPSTGR*PPGVGGAGRSHRREGTIPGNPHPR 

AS*RAGWQR*PGP/REWGI**EPQGBEMSGPGGPGGAPPNOVGSS 
VMQAMSTGI 


6956 




782 


P^RRQVRAQVAGAPVGHWGTRARQVKTXSGRRRARRTMPFLGQD 
WRS PG WS WI KTEDG WKRCES CSQKLERENNHCNI SHS 1 1 LNS ED 
GEIFNNEEHEYASKKRKKDHFRNDTNTQSFYREKWIYVHKESTK 
ERHGYCTLGEAFNRLDFSSAIQDIRKFNYWKLLQLIAKSQLTS 
LSGVAQKNYFNILDKIVQXVLDDHHNPRLIKDLLQDLSSTLCIL 
/N*RSREVCISGKHQYLDLPIRNYSRLATTATGSSDD*ASE\NG 
LTLSDLPLHMLNN I L YR PS DG WDI I TLGQVTPTLYMLS EDRQLW 
KKLCQYHFAEKQPCRHLILSEKGHIEWKLMYFALQKHYPAKEQY 
GDTLHFCRHCS IL F WKDSGH PCTAAD PDS CFTP VS PQHF 1 DL FK 
F 




8605 


3839 

I 


yXbTS I FAS PI'S P PVLGE^ VliQDKfSFDLNNGSDAEQEEMETQSS 

DFPPSLTQPAPDQSSTIQLHPATSPAVSPTTSPAVSLWSPAAS 

PEISPEVCPAASTWSPAVFSWSPASSAVLPAVSLEVPLTASV 

TSPKASPVTSPAAAFPTASPANKDVSSFLETTADVEEITGEGLT 

ASGSGDVMRRRIATPEEVRLPLQHGWRREVRIKKGSHRWQGETW 

YYGPCGKRMKQFPEVI KYLSRNWHS VRREHFSFSPRMP VGD FF 

EERDTPEGLQWVQLSAEEIPSRIQAITGKRGRPRNTEKARTKEV 

PKVKRGRGRPPKVKITELLNKTDNRPLKKLEAQETLNEEDKAKI 

AKS KKKMRQKVQRGECQTT IQGQARNKRKQETKSLKQKEAKKKS 

KAEKEKGKTKQEKLKEKVlCRBKKEKVKMKEKEEVTKAKPAr * a n 

KTLATQRRLEERQRQQMrLEEMKKPTEDMCIiTDHQPLPDFSRVP 

GLTL PSGAFSDCLT I VEFLHS FGKVLG FDPAKDVPSLG VLQEGL 

\j CQGDSLGE VQDLLVRLLKAALHD PGFPS YCQSLKILGEKVSE I 

PLTRDNVSEILRCFLMAYGVEPALCDRLRTOPFQAQPPQQKAAV 

U^FLVHELNGSTLIINEIDKTLESMSSYRKWKWIVEGRIjRJRLKT 

^KRTGRSEVEMEGPEECLGRRRSSRIMEVTSGMEEEEEEESI 

VWPGRRGRRDGEVDATASS I PELERQ IEKLS KRQL FFRKKLLH 

3SQMLRAVSLGQDRYRRRYWVLPYLAGIFVEGTEGNLVPEEVIK 
CETDSLKVAAHASIxNPALFSMKMEIAGSNTTASSPARARGRPRK 
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Amxno acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








TKPGSMQPRHLKSPVRGQDSEQPQAQLQPEAQLHAPACjPQPQLQ 
LQLQ SH KGFLEQEGS PLSLGQSQH DLSQS AFLSWLS QTQSHSS L 
LSSSVLTPDSSPGKLDPAPSQPPEEPEPDEAESSPDPQALWFNI 
SAQMPCNAAPTPPPAVSEDQPTPSPQQIASSKPMNRPSAANPCS 
PVQFSSTPLAGLAPKRRAGDPGEMPQSPTGLGQPKRRGRPPSKF 
FKQMEQRYLTQLTAQPVPPEMCSGWWWIRDPEMLDAMLKALHPR 
G I R EKALHKHLNKHRD FLQE VCLRPS ADP I FEPRQLPAFQEG I M 
S WS PKEKTYETDLAVLQ W VEELEQR VI MSDLQ I RG WTC PS PDST 
REDLAYCEHLSDSQEDITWRGRGREGLAPQRKTTNPLDIiAVMRL 
AALEQNVERRYLREPLWPTHEVVLEKALLSTPNGAPEGTTTEIS 
YEITPRIRVWRQTLERCRSAAQVCLCLGQLERSIAWEKSVNICVT 
CLVCRKGDNDEFLLLCDGCDRGCHIYCHRPKMEAVPEGDWFCTV 
GLAQQVEGEFTQ KPGP P KRGQKRKSG YS LNFS EGDGRRRRVLLR 
GRE S PAAGPR YS EEQLS PS KRRRLSMRNHHSDLTFCE I ILMEME 
SHDAAWP FLE PVNPRLVSG YR R 1 1 KNPMDFSTMRERLLRGGYTS 
SEEFAADALLVFDNCQTFNEDDSEVGKAGHIMRRFFE\SRWEEF 
YQG KQGQS VRQGRWG VTLWHLP PT FQTKTCHFH LLMLP WVOTQV 
RYNPDF 


6957 
6958 


82 


3514 


HLIVAMPEWkkeENEVPAPAPPPEEPSKEKEAGTTPAKDWTLV 

ETPPGEEQAKQNANSQLSILFIEKPQGGTVKVGEDITFIAKVKA 

EDLSEKPTINGSRKWMDLASKAGKHLQLKETFERHSRVYTFEMQ 

1 1 KAKDNFAGNYRCEVTYKDKFDS CS FDLEVHESTGTTPNIDI R 

SAFKRSGEGQEDAGELDFSGLLKRREVKQQEEEPOVDVWELLKN 

TKP S E YEKIAFQ YES PTCSGMLKRL KRS I RE EKKSAAFAKI LD P 

VYQVDKGGR VRF WELAD P KI»E VKWNKNGQELR P STKY I FEDTR 

CQS I LNIDNCQMTDDSE YYVTAGDEKCSTELLVREPPIMVTKQL 

EDTTD YCGERVELECEVS EDDAQVKWFKNGEE 1 1 LVQTR YRIRV 

EGKKHILIIEGATKADAADYSVMTTGGQSSAKLSVDLKPLKILT 

PLTDQTVNLGKEICLKCErSENIPGKWTKNGLPVQESBRLKWH 

KGRIHKLVIDHAIiTEDEGDYVFAPDAYNVTLPAKVHVIDPPKI I 

LDGLDADNTVTV I AGNKLRLE I P I SGE PP PKAM WS RGDKAI MEG 

SGR I RTES Y PDS S TLVI D I AERDDSGVYH I NLKNEAG EAHAS I K 

VKWDFPDPPVAPTVTEVGDDWCIMNWEPPAYDGGSPILGYFIE 

RKKKQSSRWMRLNFDLCKETTFEPKKMIEGVAYEVRIFAVNA\I 

GISKPSMPSRPFVPIJVVTSPPTLLTVDSVTDTTVTMRWRPPDHI 

GAAGLDGYVLEYCFEGSTSAKQSDENGEAAYDLPAEDWIVANKD 

LIDKTKFTITGLPTDAKIFVRVKAVNAAGASEPKYYSQPILVKE 

1 1 EPP KIHS PKHLKQTY I RR VGDR VI IiVI PFQG KPRPELT WKKD 

GAEI0KNQINIRNSETDTI I FIRKAERSHSGKYDLQVKVDKFVE 

TASIDIRIIDRPGPPQIVKIEDVWGRNVALTWTPPKDDGNAAIT 

GYTIQKADKKSMEWLRVIEHIIEPVPHTELVIGNEYYFRVFSEN 

MCGLSEDATMTKESAVIARDGKIYKNPVYEDFDFSEAPMFTQPL 

VNRLCHSGYMATLNCSVRGNPKPKITWMKNKVAIVDDPRYRMFS 

NQGVCTLE I RKPS P YDGGT YCCKAVNDLGTVE I ECKLEVKVI AO 


6959 


274 
1 


1663 
1469 


PRTSR VKTEGSQG S SAMD FS VKVD I EKE VTCP I CL E LLTE PLS L 
DCGHS FCQACITAKI KES VI ISRGESS CPVCQTRFQPGNLRPNR 
HLANI VERVKE VKM S PQEGQ KRD VCEHHGKKLQ I FCKEDGKVI C 
WVCELSQEHQGHQTFRINEWKECX5EKLQVALQRLIKENQEAEK 
LEDDIRQERTAWKNYIQIERQKILKGFNEMRVILDNEEQRELQK 
LEEGEVNVLDNLAAATDQLVQQRQDASTLISDLQRRLRGSSVEM 
LQDVIDVMKRSESWTLKKPKSVSKKLKSVFRVPDLSGMLQVLKE 
LTDVQYYWVDVMLNPGSATSNVAISVDQRQVKTVRTCTFKNSNP 
CDFSAFGVFGCQYFSSGKYYWEVDVSGKIAWILGVHSKISSLNK 
RKSSGFAFDPSVNYSKVYSRYRPQYGYWVIGLQNTCEYNAFEDS 
SSSDPKVLTLFMAV\ LPWLGFS 

SLVHVVEFGRCJIEDFPYLFFQLTHCOQRICSVTQAGVQWCDHSS 
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(A=Alanine, C=Cysteine, D^Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








LQPQTPGLNQSSHLSLLSSRDYRMLSSPNEWFWQDRFWLPPNVT 
WTTSLEDRDGRWPHPQDLLAALPLALVLLAMRLAFERFIGLPLS 
RWLGVRDQTRRQVKPNATLEKHFLTEGHRPKEPQLSLLAAQCGL 
TLQQTQRWFRRRRNQDRPQLTKKFCSASWRFLFYLSSFVGGLSV 
LYHESWLV7APVMCWDRYPNQLTLSCPAADSEA\SLYWWYLLELO 
FYLSLLIRLP FDVKRKGGG PS S I KPRPHYDPPSTA\DFKEQVIH 
H FVA V I LMTFS YS ANLLRIGSL VLLLHDS S D YLLEACKM VNYMQ 
YQQVCDALFLIFSFVFFYTRLVLFPTQIIjYTTYYESISNRGPFF 
GYYFPNGLLMLIjQLLHVFWSCIiILRMIjYSFMKKGQMEKDIRSDV 
EESDSSEEAAAAQEPLQLKNGTAGGPRPAPTDGPRSRVAGRLTN 
RHTTAT 


6960 


387 


2068 


AKWARE KE MQEF \TRS FF\RGRPDLSTLTHS IVRRRYLAHSGRS 
HLE PEE KQALKRLVE EE PLKMQVDEAASREDKLDLTKKGKRPPT 
PCSDPERKRFRPNSESESGSEASSPDYFGPPAKNGVASRSHTHP 
KEENPRRA\SKAVEBSSDEERQRDLPAQRGEESSEEEEKGYKGK 
TR KKP WKKQAPGKAS VSRKQAR EE SE ES EAEP VQRT AKKVEGN . 
KGTKSLKESEQESEEEILAQKKEQRBEEVEEEEKEEDEEKGDWK 
PRTRSNGRRKSAREERS CKQKSQAKRLLGDS DSEEEQKEAASSG 
DDSGRDREPPVQRJKSEDRTQLKGGKRLSGSSEDEEDSGKGEPTA 
KGSRKMARLGSTSGEESDLEREVSDSEAGGGPQGERKNRSSKKS 
SRKGRTRSSSSSSDGSPEAKGGKAGSGRRGEDHPAVMRLKRYIR 
ACGAIlRNYKKLIiGSCCSHKERLS ILRAELEAIjGMKGTPSLGKCR 
ALKEQREEAAEVASLDVANI ISGSGRPRRRTAVJNPLGEAAPPGE 
LYRRTLDSDEERPRPAPPDWSHMRGIISSDGESN 


6961 


340 


1646 


RPWSSPTMKPNFSLRLRIF^UJeWGtPYLSKHRADRMRRLGDFL 
NQES FDLALLEEVWS EQDFQYLRQKLS PTYPAAHHFRSG I IGSG 
LCVF3KHPIQELTQHIYTLNGYPYMIHHGDWFSGKAVGLLVLHL 
SGMVLNAYVTHLHAE YNRQKDI YLAHRVAQAWELAQFIHHTSKK 
ADWIiLCGDLNMHPEDLGCCLLKEWTGLHDAYLETRDFKGSEEG 
NTMVPKNCYVSQQELKPFPFGVRIDYVLYKAVSGFYISCKSFET 
TTGFDPHRGTPLSDHEALMATLFVRHS PPQQNPSSTHGP \ AERS 
PL/MCVCLKEALDGSLGLGMA\QARWWA\TFA\SYVIGLGL\riL 
LALLCVLAAGGGAGEAAILLWTPSVGLVLWAGAFYLFHVQEVNG 
LYRAQAELQHVLGRAREAQDLGPEPQLYALL\LGQQEGDRTKEQ 


6962 


340 


1646 


RPWSSPTMKPNFSLRLRIFNLNCWGIPYLSKHRADRMRRLGDFL 
NQES FDLALLEE VWSEQDFOYLRQKIjS PTYPAAHHFRSG t IGSG 
LCVFSKHPIQELTQHIYTLNGYPYMIHHGDWFSGKAVGLLVLHL 
S GMVLNAYVTHLHAE YNRQKD I YLAH RVAQAWELAQF I HHTS KK 
ADVVLLCGDLNMHPEDLGCCLIiKEWTGLHDAYLETRDFKGSEEG 
NTMVPKNCYVSQQELKP FPFGVRID YVLYKAVSGFYI SCKS FET 
TTGFDPHRGTPLSDHEALMATLFVRHSPPQQNPSSTHGP\AERS 
PL/MCVCLKEALDGSLGLGMA\QARWWA\TFA\SYVIGLGIi\LL 
LALLCVLAAGGGAGEAAIL LWTPS VGLVLWAGAFYLFHVQE VNG 
liYRAQAELQHVLGRAREAQDLGPEPQLYALLXLGQQEGDRTKEQ 


6963 


374 


2618 


RVTPLILKLLKKPKTAENQKASEENEITQPGGSSAKPGLPCLNF 
EAVLSPDPALIHSTHSLTNSHAHTGSSDCDISCKGMTER1HSIN 
LHNFSNSVLETLNEQRNRGHFCDVTVRIHGSMLRAQRCVLAAGS 
PFFQDKLLLGYSDIEIPSWSVQSVQKLIDFMYSGVLRVSQSEA 
LQILTAAS ILQI KTVIDECTRIVSQNVGDVFPGIQDSGQDTPRG 
TP ESGTSGQS S DTESG YLQS H PQHS VDR I YSAL YACSMQNGSGE 
RSFYSGAWSHHETALGLPRDHHMEDPSWITRIHERSQQMBRYL 
STTPE TTHCRKQ PRP VRI QTLVGN IH I KQEMEDD YD YYGQQRVQ 
ILERNESEECTBDTDQAEGTESEPKGESFDSGVSSSIGTEPDSV 
EQQFG PG AARDSQ AE PTQ P EQAAEAPAEGG PQTNQLETGAS S P E 
RSNEVEMDSTVITVSNSSDKSVLQQPSVNTSIGQPLPSTQLYLR 
QTETLTSNLRMPLTLTSNTQVIGTAGNTYLPALFTTQPAGSGPK 
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Amano acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








PFLFSLPQPLAGQQTQFVTVSQPGIjSTFTAQLPAPQPLASSAGH 
STASGQGEKKP YECTLCN KT FTAKQNYVKHMFVHTGEKPHQ CS I 
CWRS FS LKDYL I K\HMVTHTGVRAYQCS ICNKRPTQKSSLNVHM 
RLHRGEKSYBCYICKKKFSHKTtiLERHVALHSASNGTPPAGTPP 
GARAG P PGWACTEGTTYVCS VCPAKFDQ I EQ FNDHMRMHVS DG 


6964 


1 


178 1 


SGRPFFFFFSNTDVYFIKKVTNRWTAGSSYKMTRMKSIGKILLL 
QIFIG\NCSMFVLVI 


6965 


757 


208 


NVF IEPRIQGFM KTS AH PGQ KH PD FSMGLLFPLLAALE VCS CGS ' 

SGSLGYNLPQNH\GLLGRNTLVLLGQMRRISPPLCLKDRSDFRP 

PQEKVEVSQLQKA\QAMSFLYDVLQQVFNFSHKAIiL\CCMEHDL 

PGPTPHFTSSAAGTPGDLLGAGDGRRRSWGQWIEGSTLALRRY 

FQESISTLE 


6966 


820 


1867 


IITAU3VRGMPGCPCPGCGMAGPRLLFLTAIALELLGRAGGSQP 
ALRS RGT ATACRLDNKES E S WGALLSGERLDTW 1 CSLLGSLMVG 
LSG V FP LLV I PLEMGTMLRS EAG AWRL KQLLS FALGGLLGNVFL 
HLLP EAWAYTCS AS PGGEGQS LQQQQQ LGLWVT AG ILTFLALEK 
/HVPGQQGGGDQPGPQQRPHCCCRRAQWRPLSGPAGCRARPRCR 
GP \ D 1 KVSG YLNLLANTI DNFTHGLAVAAS FL VS KKIGLLTTMA 
I LLI IE I PHE VGDFAI LLRAGFDRWSAAKLQLSTALGGLLGAG PA 
ICTQSPKGVEETAAWVLPFTSGGFLYIALVNVIiPDLLEEEDPW 


6967 


162 


633 


GFLPFKYWILDLSASSRMETDCNPMELSSMSGFEEGSELNGFEG " 
TDM KDMKLE AE AWND VL FAVNNM FVS K5 LRCADD VAY I NVETK 
ERNRYCLELTEAGLKWGYAFDQVDDHLQTPYHETVYSLLDTL\ 
S PAYREAFGKR \LLQRLEAIiKRDGQS 


6968 


1 


2265 


RGGGGGRGGPGARERERPGEPERTMEAAAGGRGCFQPHPGLQKT 
LEQFHLSSMSSLGGPAAPSARWAQEAYKKESAKEAGAAAVPAPV 
PAATEPPPVLHLPAIQPPPPVLPGPFFMPSDRSTERCETVLEGE 
TI SCFWGGEKRLCLPQI LNS VLRDFSLQQINAVCDELH I YCSR 
CTADQLEI LKVMGI LPFS APSCGLITKTDAERLCNALLYGGAYP 
PPCKKELAASLALGLELSERSVRVYHE\CFGKCKGL\LVPEI»YS 
S PS AACI QCLD \ CRLM YP PHKF WHSHKALENRTCHWGF \ DS A\ 
NWRAYILLSQDYTGKEEQARLGR \ CLDDVKEKFD YGNKYKRRVp 
RVSSEPPASIRPKTDDTSSQSPAPSEKDKPSSWLRTLAGSSNKS 
LGCVHPRQRLSAFRPWSPAVSASEKELSPHLPALIRDSFYSYKS 
FETAVAPNVALAPP AQQKWSSPPCAAAVSRAP E PLATCTQPRK 
RKLTVDTPGAP ETLAP VAAP EEDKDS EAEVEVE SREEFTS S LS S 
LSSPSFTSSSSAKDLGSPGARALPSAVPDAAAPADAPSGLEAEL 
EHLRQALEGGLDTKEAKE KFLHEWKMRVKQEEKLS AALQAKRS 
LHQE LE FLR VAKKE KLREATEAKRNLR KE I ERJjRAE WE K KM KEA 
NESRLRLKRELEQARQARVCDKGCEAGRLRAKYSAQIEDLQVKL 
QHAEADREQLRADLLREREAREHLE K\ WK\ELQEQLWPRARPE 
AAGSEG\AAELEP 


6969 


1855 


118 


AGTMHGRLKVKTSEEQAEAKRLEREQKLKLYQSATQAVFQKRQA'"" 

GBLDES VLELTSQI LGAN PDFATLWNCRREVLQQLETQKS PEEL 

AALVKAELGPLESCLRVNPKSYGTWHHRCWLLGRLPEPNWTREL 

ELCARFLEVDERNFHCWDYRRPVATQAAVPPAEELAFTDSLITR 

NFSNYSSWHYRSCLLPQLHPQPDSGPQGRLPEDVLLKELELVQN 

AFFTDPNDQSAWFYHRWLLGRADPQDALRCLHVSRDEACLTVSP 

SRPLLVGSRMEILLLMVDDSPLIVEWRTPDGRNRPSHVWLCDLP 

AASLNDQLPQHTFRVIWTAGDVQKECVLLKGRQEGWCRDSTTDE 

QLFRCELSVEKSTVLQSETiESCKELQELEPEMKWCL\LTI I LLM 

RALDPLLYEKETLQYFQTLK\AWDPKRATY\LDDLRSKFLLENS 

VLKMEYAEVRVLHIAHKDLTVLCHLEQLLLVTHLDLSHNRLRTL 

PPALAALRCLEDPPPRT\VLQASDNAIESLDGVTNLPRLQELLL 

OWRLQQPAVLQPIiASCPRLVLLNLQGNPLCQAVGILEQLAELL 

PSVSSVLT 
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Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 


6970 


3 


1528 


SFPPLI^SPSAVGEGKVAVAAPCPGRSECARAKMAYIQLEPLNE " 

GFLSRISGLLLCRWTCRHCCQKCYESSCCQSSEU3EVEILGPFPA 

QTP P WLMAS RSSDKDGDS VHTASE VPLTPRTNSPDGRRS SSDTS 

KSTYSLTRRISSLESRRPSSPLIDIKPIEFGVLSAKKEPIQPSV 

LRRTYNPDDYFRKFEPHLYSLDSNSDDVDSLTDEEILSKYQLGM 

IjH FS TQYDLLHNHLTVRVI EARDL P P P I SHDG SRQDMAHSNPY V 

KICLLPDQKNSKQTGVKRKTQKPVFEERYTFEIPFLEAQRRTLL 

LTWDFDKFSRHCVIGKVSVPLCEVDLVKGGHWWKALIPSSQNE 

VELGELLLSLNYLPSAGRLNVDVIRAKQLLQTDVSQGSDPFVKI 

QLVHGLKLVKTKKTSFLRGTIDPFYNESFSFKVPQEELENASLV 

FTVFGHNMKSSNDFIGRIVIG\QYSSGP\SEPNHWRRMLNTHRT 

AVEQWHSLRSRAECDRVS PASLEVT 


6971 


37 


3702 


ACFYVPGSRSFKLIPRHGLVNMGRSGKLPSGVSAKLKRWKKGHS 
SDSNPAICRHRQAARSRFFSRPSGRSDLTVDAVKLHNBLQSGSL 
RLGKSEAPET PMEEEAE LVLTEKS SGTFLSGLS DCTNVTFS KVQ 
RFWESNS AAHKE ICAVLAAVTEVI RSQGGKETETEYFAALIRKA 
AQHGVCSVLKGSEFMFEKAPAHHPAAISTAKFCIQEIEKSGGSK 

EATTTIiHMLTLiIj kdll p cfpeglvks cs etllrvmtls hvlvta 
camqafhslfharpglstlsaelnaqiitalydyvpsendlqpi* 

IAWLKVMEKAHINLVrU.QWDWLGHLPRFFGTAVTCLLSPHSQV 
LTAATQSLKEI LKECVAPHMADIG3 VTSSASGPAQSVAKM FRAV 
EEGLTYKFHAAWSSVLQLLCVFFEACGRQAHPVMRKCLQSIjCDL 
RLSPHFPHTAALDQAVGAAVTSMGPEWLOAVPLEIDGSEETLD 
FPRS WLL PVI RDHVQETRLG F PTT Y FLPLANTLKS KAMDLAQAG 
STVESKIYDTLQWQMWTLLPGFCTRPTDVAISFKGLARTLGMAI 
SERP DLR VTVCQALRTL ITKGCQAEADRAEVSRFAKNFLP I L FN 
LYGQPVAAGDTPAPRRAVLETIRTYLTITDTQLVNSLLEKASEK 
VLDPASSDFTRLSVLDLWALAPCADEAAISKLYSTIRPYLESK 
AHGVQKKAYRVLEEVCASPQGPGALFVQSHLEDLKKTLLDSLRS 

GAR KNAFALLVEMGHAFLR FGSNQEEALQCYLVL I Y PGL VGAVT 
MVSCSIIiALTHLLFEFKGLMGTSTVEQLLENVCLLLASRTRDW 
KSALGFI KVAVTVMDVAHLAKHVQLVMEAIGKLSDDMRRHFRMK 
LRNLFT \ KFI PK \ FG ILTWGKKAVG P KEYHRVLVN IRKAEARAK 
RHRALSQAAVBEEEEEEEEEE PAQGKGDS IEEI IADSEDBEDNE 
EEERS RGKEQR KLARQRSRAWLKEGGGDE PLNFLD PKVAQRVXiA 
TQPGPGRGRKKDHS FKVSADGRLI IREEADGNKMEEEEGAKGED 
EEMADPMEDVI IRNKKHQKLKHQKEAEEEELE I PPQYQAGGSGI 
HRPVAKKAMPGAEYKAKKAKGDVKKKGRPDPYAYI PliNRS KLNR 
RKKMKLQGQFKGbVKAAQR GSQ VGHKNR RKDRRP 


6972 


2179 


973 


PGGAILLPLWRRTRPREATVPRGAAQRGRARSAEGRIPSSQSPS 
PAE AGGATRS PP PR P PRPAR P PGPS APPLLRSDAG PGATVS AAA 
AAATBRARRGATMGAQLSTLGHMVLFPVWFLYSLLMKLFQRSTP 
AlTIiESPDIKYPLRLIDREIISHDTRRFRFALPSPQHILGLPVG 
QHIYLSARIDGNLWRPYTPISSDDDKGFVDLVIKVYFKDTHPK 
F PAGGKMS Q YLBSMQ IGDT I E FRG P SGLLVYQGKG KFAIRPDKK 
SNPI I RTVKS VGM I AGGTG ITPMLQVI RAIMKDPDDHTVCHLLF 
ANQTEKDILLRPELEELRNKHSARFKLWYTI,DRAPEAWDYGQG\ 
FVNBEMIRDHI*PPPE\EEPLVLMCX3PPPMIQYACI,PNIi\DHVGH 
PTERCFVF 


6973 


1 


1964 


LQ PRCAH RGLRAQKCGR PAPGVDAMVLC P VI GKLLH KR WLAS A 
S PRRQE I LS NAGLR FE WPS KFKEKLDKAS FATP YG YAMETAKQ 
KALEVANRLYQKDLRAPDWIGADTIVTVGGLILEKPVDKQDAY 
RMLSRFE /SGREHS VFTGVAI VHCSSKDHQLDTRVS E FYEETKV 
KFSELSEELLWEYVHSGEPMDKAGGYGIO^LGGMLVESVHGDFl. 
NWGFPLNHFCKQLVKLYYPPRPEDLRRSVKHDSIPAADTFBDL 
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(A-Alanine, C-Cysteine, D=Aspartic Acid, E* . 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








SDVEGGGSEPTQRDAGSRDEKAEAGEAGQATAEAKCHRTRETLP 
P FPTRLLELIEGFMLS KGLLTACKLKVFDLLKDEAPQKAAD I AS 
KVDASACGMERLLD I CAAMGLLEKTEQGYSNTETANVYLASDGE 
YS LHG P I MHNNDLTWNLFTYLE FAI REGTNQHHRALG KKAEDL F 
QDAY YQS PETRLR FMRAMKGMTKLTACQVATAFNLS R FS S ACDV 
GGCTGALARELARE YPRMQVTVFDLPDI I ELAAHFQPPGPQAVQ 
IHFAAGDFFRDPLPSAELYVLCRILHDWPDDKVHKLLSRVAESC 
KPGAGLLLVETLLDEEKRVAQRALMQSLNMLVQTEGKERSLGEY 
QCLLELKG FHQ VQ WHLGG VLDA I L\ P PKW P PEAQ AACS L 


6974 


3082 


2172 


RSCAAFASFASRPPtELFA^PGSHRSPPGRGVATSAQCALSVRK 
LLAARPGLGTKYQATMVYKTLFALCILTAGWRVQSIiPTSAPLSV 
SLPTN I VP PTTIWTS S PQNTDADTAS PSNGTHNNS VLPVTASAP 
TSLLPKNISIESREEEITSPGSNWEGTNTDPSPSGFSSTSGGVH 
IjTTTLEEHSLGTPEAGVAATLSQSAAEPPTLISPOAPASSPSSI, 
S TS P PEVFS AS VTTNH S STVTS TQPTGAPTAP ES P TEE S S SDHT 
PTSHATAEPVPQEKTPPTTVSGKVMCELIDMET\PPPFPG 


6975 


2 


500 


R PR PTVH CCKWAI»KI*E TAMETL INVFHAHS G KEGDKYKI/S KKEL 
KELLQTELSGFLDVKELML*ATEALKTFEEA* KSPI IQ CSSS RS 
SLPPAPQPPPYL*LSAVPFPIHLPLPLLPPQAQKDVDAVDKVMK 
ELDENGDGEVDFQEYWLVAALTVACNNFFWENS 


6976 


1216 


970 


GCQIj*VAYGTTENSPVTFAHFPEDTVEQKAESVGRIMPHTEARI 
MNMEAGTLAKLNTPGELCIRGYCVMLGYWGEPQKTEEAVDQDKW 
YWTGD VATMNEQGFCK I VGR SKDMI IRGGEN I YPAELEDFFHTH 
P KVQ EVQ WG VKDDRMGEEI CAC I RLKDGEETTVEE I KA PC KGK 

ISHFKIPKYIVFVTNYPLTISGKIQKFKLREQMERHLNL+IKQQ 
ACPGRLA 


6977 


1298 


588 


S IiF I NTNLLS NQ I RKTS FGMCS E P I S DNTEDQ KGKL KTPDFA * R 
ANKKSKHHVNGNRTVEPFPEGTQMAVFGMGCFWGAERKFWVliKG 
VYS TQVG FAGG YTSNPTYKEVCS EKTGHAEVVR VVYQPEHMS FE 
ELLKVFWENHDPTQGMRQGNDHGTQYRSAIYPTSAKQMEAALSS 
KENYQKVLSEHGFGPITTDIREGQTFYYAEDYHQQYLSKNPNGY 
CGLGGTGVS CP VGIKK 


6978 




242 


SFPFRDSRRCGCCKGSSLRHTAVAMVKLSKEAKQRLQQLFKGSQ 
FAIRWGFIPLVIYLGFKRGADPGMPEPTVLSLLWG 


6979 
6980 


3917 
1 


1146 
420 


DEARVRGEAVAAAIIiSRCRHWSGPPPFPPSPPDRKGLRGTEPWE 

AGPGSGATPGARAMDVRRLKVNELRBELQRRGLDTRGLKTELAE 

RLQAALEAE EP DD ERELDADDE PGRPGHINEEVET EGGS E LEGT 

AQPPPPGLQPHAEPGGYSGPDGHYAMDNITRQWQFYDTQVIKQE 

NESGYERRPLEMEQQQAYRPEMKTEMKC2GAPTSFLPPEASQIiKP 

DRQQFQSRKRPYEENRGRGYFEHREDRRGRSPQPPAEEDEDDFD 

DTLVAIDTYNCDLHFKVARDRSSGYPLTIBGFAYLWSGARASYG 

VRRGRVCFEMKINEEISVKHLPSTEPDPHWRIGWSLDSCSTQL 

GEEPFSYGYGGTGKKSTNSRFENYGDKFAENDVIGCFADFECGN 

DVELS FTKNGKWMGIAFR I QKEALGGQALYPHVL VXNCAVEFNF 

GQRAEPYCSVLPGFTFIQHLPLSERIRGTVGPKSKAECBILMMV 

GLPAAGKTTWAIKHAASNPSKKYNILGTNAIMDKMRVMGLRRQR 

NYAGRWDVLIQQATQCLNRLIQIAARKKRNYILDQTNVYGSAQR 

RKMRPFEGFQRKAIVICPTDEDLKDRTIKRTDEEGKDVPDHAVL 

EM KAN FTIjPDVGD FLDE VL F I ELQRE E ADKLVRQ YNE EGR KAGP 

PPEKRFDNRGGGGFRGRGGGGGFQRYENRGPPGGNRGGFQNRGG 

GSGGGGNYRGGFNRSGGGGYSQNRWGNNNRDNNNSNNRGSYNRA 

PQQQPPPQQPPPPQPPPQQPPpppsYSPARNPPGASTYNKNSNI 

PGS S ANTSTPTVS S YS PPQS FGFFPS TFQ PS YSQP P YNQGG YS Q 

G YTAP P P PP PP P PAYN YG S YGGYNPAP YT PPP PPTAQTYP Q P S Y 

NQYQQYAQQWNQYYQNQGQWPPYYGNYDYGSYSGNTQGGTSTQ 

GTRGRKTGRVAAPSTRRRTGNMQKLQTRSPAMSLSDPGLGYHPT 
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W^Tryptophan, Y-Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








CWTLR W P PliCSbHALHVFHCLFSS RLGTP VS PRLAMD P Wc3 C EA 

GGSCACAGSCKCKKCXCTSCKKSCCSCCPLGCAKCAQGCICKGA 
SEKCSCCA 


6981 


10 


1054 


PGRGFRRA& LR PA FAARG VFQGGLGQAKQ ARTRACAALPTPH P S 
APRLLEPQGVFSLFPPPPGPWPNMILTKAQYDEIAQCLVSVPPT 
RQSLRKLKQRFPSQSQATLLSIFSQEYQKHIKRTHAKHHTSBAI 
ESYYQRYLNGWKNGAAPVLLDIANEVDYAPSLMARLILERFXQ 
EHEETPPS KS I INSMLRDPSQI PDGVLANQVYQCI VNDCCYGPL 
VDCIKHAIGHEHBVLLRDLLLEKNLSFLDEDQLRAKGYDKTPDF 
ILQVPVAVEGHI I HW I ES KASFGDECSHHAYLHDQFWS YWNRFG 
PGLVIYWYGFIQELDCNRERGILLKACFPTN1VTLCHSIA 


6982 


153 


12B5 


FPQQDCS APAAPGLAG SE PRRLRAYRRRRQRARGLKRVAW LAP P "" 
PSLLC2GLQGWAQAPVDGTLGPEDSRASSPMIQNSRPSLLQPQDV 
GDTVETLMLKPVIKAFLCGSISGTCSTLLFQPLDLLKTRLQTLQ 
PSDHGS RRVGMLAVLLKWRTBS LLGL WKGMSPSIVRCVPGVG1 
YFGTLYSLKQY FLRGHPPTALES VMLGVGSRS VAGVCMS PITVI 
KTRYESGKYGYES I YAALRSI YHSEGHRGLFSGLTATLLRDAP F 
SGI YLM PYNQTKNI VPHDQVDATLI PITNFSCGI PAG ILAS LVT 
QPADVI KTHMQLYPLKFQWIGQAVTL1 PKD YGLRGFFQGG I PRA 
LRRTLMAAMAWTVYEEMMAKMGLK8 


6983 


82 


773 


BMSFXKJDPSFFTMGMWSIGAGAIiGAAALAIiLLANTDVFLSKPQk 
AALEYLEDIDLKTLEKEPRTFKAKELWEKNGAVTMAVRRPGCFL 
CRE EAADLS S L KSMLDQLG VPL YAWKEH IRTE VKDFQP YFKGE 
IFLDEKKKFYGPQRRKMMFMGFIRLGVWYNFFRAWNGGFSGNLE 

GEGFILGGVFWGSGXQGILLBHREKEFGDKVNLLSVLEAAKMI 
KPQTLASEKK 


6984 


1845 


1282 


GGRS AYSLPAGS LPRVPATAAAKMASGVQVADEVCRI FYDMKVR 
KCSTPEEIKKRKKAVI FCLSADKKCI IVEEGKE I LVGDVGVTIT 
DPFKHFVGMLPEKDCRYALYDASFETKESRKEELMFFLWAPEIA 
PLICS KM I YAS S KDAI K KKFQGI KHE CQANGPEDLNRAC I AE KLG 
GSLIVAFEGCPV 


6985 


1887 


1324 


RRTAGIYPC^PKPGRTRHALCSWLLLLTGQLAFDDFQESCAMM 
WQKYAGSRRSMPLGARILFHGVFYAGGFAIVYYLIQKFHSRALY 
YKLAVEQIiQSHPEAQEALGPPLNIHYLKLIDRENFVDIVDAKLK 
IPVSGSKSEGLLYVHSSRGGPFQRWHLDEVFLELKDGQQIPVFK 
LSGENGDEVKKE 


6986 


642 


1350 


YHLYFKMGDPNSRKKQALNRLRAQLRKKKEStADQFDFKMYiEAF 
VFKEKKKKSALFEVSEVIPVMTNNYEENILKGVRDSSYSLESSL 
ELLQKDWQLHAPRYQSMRRDVIGCTQEMDFILWPRNDIEKIVC 
LLFSRWKESDEPFRPVQAKFEFHHGDYEKQFLHVLSRKDKTGIV 

VNNPNQSVFXF1DRQHLQTPKNKATIFKLCSICLYLPQEQLTHW 
AVGTIEDHLRPYMPE 


6987
6988


1623 


341 


bKAAfiKASRAFKESQRQTDSKNYETENWSPQKSQRRYDMYNTAC-" 

FI^EIEVGLYTIQILQLTPFFHKENELSKKHMVQFLSGKWTIPP 

DPRNECYLALSKFTSHLKNLQSDLKRCFDFFIDYMVLLKMRYTQ 

KEIABIMLSKKVSRCFRKYTELFCHUDPCLLQSKESQLLQEENC 

KJUUjKAL>RAURFAGLLEYLNPNYKDATTMESIVNE 

KKPMTNEKQNSILANIILSCLKPNSKXiIQPIiTTLKKQIiREVLQF 

VGLSHQYPGPYFLACtLFWPENQELDQDSKLIEKYVSSLNRSFR 

GQYKRMCRSKQASTLFYLGKRKGLNSIVHKAKIEQYFDKAQNTN 

SLWHSGDVWKKNE VKDULRRLTGQAEGKL1 S VE YGTEE KI KI P V 

I S VYS G PLRS GRN I ERVS FYLGFS I EG P PGL 




3 


689 


TQLLRRPAVFVGSAASGIRSGLWSASSGHWCAPAAGRAHAPVPR 
LVRGLGAASTAAPQ DAQTGPQ P M P RADC IMRHLP YFCRGQWRG 
FGRGS KQLG I PTAN F P EQ WDNLPAD ISTG I YYGWAS VGSGDVH 
KMWSIGWNPYYKNTKKSMBTHIMHTFKEDFYGEILNVAIVGYL 
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








RPEKNFDSLESLISAIQGDIEEAKKRIiELPEHLKIKEDNFFQVS
KSKIMNGH 


6989 


2 


1118 


LMPSDRPLSPSTHASAGSHCHAPPTTARRAFPIPFGSKSNMAT"ir~ 

KDQLIYNLLKEEQTPQNKITWGVGAVGMACAISILMKDLADEL 

ALVDVIEDKLKGEMMDLQHGSLFLRTPKIVSGKDYNVTANSKLV 

1 1 TAGAR QQEGESRLNLVQRNVNI FKFI I PNWKYS PNCKLIiI V 

SNPVDILTYVAWKISGFPKKRVIGSGCNLDSARFRYLMGERLGV 

HPI^CHG^LGEHGDSSVPVWSGMNVAGVSLKTLHPDLGTDKDK 

EQWKEVHKQWESAYEVIKLKGYTSWAIGLSVADLAESIMKNLR 

R VHP VS TM I KGL YG I K DD VFLS VP C I LGQNGI SDL VKVTLTS EE 

EARLKKSADTLWGIQKELQF 


6990 


719 


258 


THASGMAS WLALRTRTAVTSLLS PTPATALAVRYASKKSGGSS 
KNLGGKSSGRRQGIKKMEGHYVHAGNI IATQRHFRWHPGAHVGV 
GKNKCLYALEEGIVRYTKEVYVPHPRNTEAVDLITRLPKGAVLY 
KTFVHVVPAKPEGTFKLVAML 


6991 
6992


169 


451 


RRSSDFHNPGFI^RPVSLRBNliiHQVICSTKNkRRNPKKlXYTL - 
S S LLMTNLN PNES TENQP VDAYWAFTLDQE FLT YACVEGTGCL F 
CGRHVH 




944 


510 


RQAPGCSSLALRQVRQVYCGLVRAPQVQTRPLSSRFVERRGALY 
RSPMNQENPPPYPGPGPTAPYPPYPPQPMGPGPMGGPYPPPQGY 
P YQG YPQ YGWQGG PQE P PKTTVYWEDQRRDE LG PST CLTACWT 
ALCCCCLWDMLT 


6993 


1 


374 


OWCVTCPQHNARQGPAVPPGIQAYGAAP^tQVDFTEMSKCRG 
DRVWIKNWNVASLCPLWKGPQTWLSPPTAVXVEGIPAWIHHSH 
VKPAARETWEARPSPDNPFRVTLKKTTSPAPVTPGS 


6994
6995


346 


1100 


QWPEKDPVMAASSISSPWGKHVFKAILMVLVAJbll^HSALAQSR 
RDFAPPGQQKREAPVDVLTQIGRSVRGTLDAWIGPETMHLVSES 
SSQVLWAISSAISVAFFAliSGlAAQLLNALGliAGDYIAQGLKLS 
PGQVQTFLLWGAGALWYWLLSLLLGLVLALLGRI lwglklvi f 
LAGFVALMRS VPDPSTRALLLLALLI LYALLSRLTGSRASGAQL 
EAKVRGLERQVEELRWRQRRAAKGARSVEEE 




144 


1346 


GSVAVGLSGIMAAQKDLWDAIVIGAGIQGCFTAYHLAKHRKRIIi ' 

LLEQFFLPHSRGSSHGQSRIIRKAYLEDFYTRMMHECTQIWAQL 

BHEAGTQLHRQTGLLLLGMKENQELKTIQANLSRQRVEHQCLSS 

EELKQRFPNIRLPRGEVGLLDNSGGVIYAYKALRALQDArRQLG 

G I VRDGEKWE I N PGLLVTVKTT5RS YQAKS LVTTAGPWTNQLL 

RPLGIEMPLQTLRINVCYWREMVPGSYGVSQAFPCFLWLGLCPH 

HIYGLPTGEYPGLMKVSYHHGNHADPEERDCPTARTDIGDVQIL 

SSFVRDHLPDLKPEPAVIESCMYTNTPDEQP1LDRHPKYDNIVI 

GAGFSGHGFKLAPWGKILYELSMKLTPSYDLAPFRISRFPSLG 
KAHL 


6996


543


1942 


ETANAEAAARKSAMDWKEVLRRRLATPNTCPNKKKSEQELKDEE 
MDLFTKYYSEWKGGRKNTNEFYKTIPRFYYRLPAENEVLLQICLR 
EESRAVFLQRKSRELLDNEELQNLWFLLDKHQTPPMIGEEAMIN 

YENFLKVGEKAGAKCKQFFTAKVFAKLLHTDSYGRISIMQFFNY 
VMRKVWLHQTRIGLSLYDVAGOGYTiRECinT.PTJVTT t?t tdtt ™-%t 

DGLEKSFYSFYVCTAVRKFFFFLDPLRTGKIKIQDILACSFLDD 
LLELRDEELSKESQETNWFSAPSALRVYGQYLNLDKDHNGMLSK 
EELS RYGTATMTNVFLDR VFQECLT YDG EMD Y KT YLDF VLALEN 
RKBPAALQYIFKIiLDIENKGYLNVFSLNYFFRAIQELMKIHGQD 
PVSFQDVKDEIFDMVKPKDPLKISLQDLINSNQGDTVTT1LIDL 
NGFWTYENREALVANDSENSADLDDT 


6997 


370 


1104 


AMELTIFILKLAIYILTFPliYliLNFLGLWSWICKKWFPYFIjVRF " 

TVIYNEQMASKKRELFSNLQEFAGPSGKLSLLEVGCGTGANFKF 

YPPGCRVTCIDPNPNFEKFLIKSIAENRHLQFERFWAAGENMH 
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








QVADGSVDWVCTLVLCSVKNQERIIjREVCRVLRPGGAFYFMEH 
VAAECSTWNYFWQQVLDPAWHI*IjPDGCNLTRESWKAJuERASFSK 
LKLQHIQAPLSWELVRPHIYGYAVK 


6998 


2 


616 


FVSRALLRVRSRRHPAEERAAPGRPEDAPIECPGA'TNCPEPLW<j 
SHLPVPYAPPTMESRGKS AS S PKPDTKVPQVTTEAKVPPAADGK 
APLTKPSKKEAPAEKQQPPAAPTTAPAKKTSAKADPALLNNHSN 
LKPAPTVPSSPDATPEPKGPGDGAEEDEAASGGPGGRGPWSCEN 
FNPIiLVAGGVAVAAIALILGVAFLVRKK


6999 


14 


1591 


GRAGACSRRDTAMSIEIESSDVIRLIMQYLKENSLHRALATLQE 
ETTVS LNTVDS I E S FVAD I NSGH WDT VLQA IQS LKLPDKT L I DL 
YEQWLELIELRELGAARSLIiRQTDPMIMLKQTQPERYIHLENL 
LARSYFDPREAYPDGSSKEKRRAAIAQAIAGBVSVVPPSRLMAL 
LGQALKWQQHQG LL PPGMT I DLFRGKAAVKD VE EE KFPTQ LS RH 
IKTOQKSHVECARFSPDGQYLVTGSVDGFIEWNFTTGKIRKDL 
KYQAQDNFMMMDDAVLCMCFSRDTEMLATGAQDGKIKVWKIQSG 
QCLRRFERAHSKGVTCLSFSKDSSQILSASFDQTIRIHGLKSGK 
TLKE FRGHSS F VN EATFTQDGH Y IIS AS S DGT VKI WNMKTTECS 
NTFKSI^STAGTDITVNSVILLPKNPEHFWCNRSOT*VVIIWMQ 
GQIVRSFSSGKREGGDFVCCALSPRGEWIYCVGEDFVLYCFSTV 
TGKLERTLTVHEKDVIGIAHHPHQNLIATYSEDGLLKLWKP 


7000 


2 


827 


GPGVVFLELMESEGPPESERSEFFSQREEENEEEEAQEPEETGP 
KNPLLQPALTGDVEGLQKI FEDPENPHHEQAMQLLLEED I VGRN 
LLYAACMAGQS DVT RALAKYG VNIiNE KTTRGYTLLHCAAAWGRL 
ETLKALVELDVDIEALNFREERARDVAARYSQTECVEFLDWADA 
RLTLKKYIAKVSLAVTDTEKGSGKLLKEDKNTILSACRAKNEWL 
ETHTEASINELFEQRQQLEDIVTPIFTKMTTPCQVKSAKSVTSH 
DQKRSQDDTSN 


7001 


2056 


844 


RRCIi 1 1 AFLKGCF IFIYFIFI FETEFLS CCPG WS AVAQS RL I AN 
FASQVQAI FILPKDSQVGPD VKSEAAPKRALYBSVFGSGE I CGP 
TSPKRLCIRPSEPVDAVWVSVKHDPLPLLPEANGHRSTNSPTI 
VSPAIVSPTQDSRPNMSRPLITRSPASPLNNQGIPTPAQLTKSN 
AP VH ID VGGHM YTS S LATLTKYPES R IGRL FDGTEP I VLDSLKQ 
HYFI DRDGQM FRY I LNFLRTS KLLI PDD FKDYTLL YEE AKY FQL 
QPMLLE ME RW KQDRE TGRFS R PCECL WRVAPDLGERI TLSGDK 
SLIEEVFPEIGDVMCNSVNAGWNHDSTHVIRFPLNGYCHLNSVQ 
VLERLQQRGFEIVGSCGGGVDSSQFSEYVLRRELRRTPRVPSVI 
RIKQEPLD 


7002 


1043 


498 


PMPSS TRWTTS * TYTDTS S A WACRPTTG TCT* TAAPG PT VR W WP 
T PCS RHQS RRRLTCWCSTSR PCGR * GGLCVRTAP TRP TTS AS SS 
SWTSAGTSWPAGRRTGrATSGTATTTSVWPGCGTRMWSTQWSSV 
PRSRSCCSRPATTPPSKPGAPHAPCASSRHLAHGLAPSSPGLPA 
RGAEVC 


7003 


818 


61 


QGRFRAF CWQRDFLQP PGMRLS ALLALASKVTL P PH YRYGMS PP 
GS VADKRKNP P W I RRR P WVE P I S DEDW YLFCGDTVE I LEGKDA 
GKQGKWQVIRQRNWWVGGLNTHYRYIGKTMDYRGTMIPSEAP 
LLHRQVKLVDPMDRKPTEIEWRFTEAGERVRVSTRSGRIIPKPE 
FPRAIXSIVPETWlDGPKDTSVEDALERTYVPCLKTliQEEVMEAM 
G I KETR \NTRRS IG I E PGAEQLL PNFCP S LEG 


7004 


121 


2285 


FLLPVLTSRSLRQPAVPHARLGGVEPAAMKSARAKTPRKPTVKK 
G\ PKRTL KTQLG/ Y YCRVRP LGF PDQE CC IE V INNTTVQLHTP E 
GYRLNRNGDYKETQYS FKQVFGTHTTQKELFDWANPLVNDL I H 
GKNGLLFTYGVTGSGKTHTMTGS PGEGGLLPRCLDMI FNS IGS F 
QAKRYVFKSNDRNSMDIQCEVDALLERQKREAMPNPKTSSSKRQ 
VDPEFADMITVQEFCKAEEVDEDSVYGVFVSYIEIYNNYIYDLL 
EEVPFDPINPNLHNLNCFVKIKNHNMYVAGCTEVEVKSTEEAFE 
VFWRGQKKRRI ANTHLNRBSSRSHS VFNI KL VQAP LDADGDNVL 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








QEKEQlTISQLSLVDLAGSERTNRTRAEGNRIiRSAGNINQSLMT " 
LRTCMDVLRENQMYGTNKPTVPyRDSKLTHLPKNyFDGEGKVRMI 
VCVNPKAED YEENLQVMRFAE VTQEVEVARPVDKAI CGLTPGRR 
YRNQPRGP\IGNEPLVTDVVLQSFPPLPSCEILDINDEQTLPRL 
IEALEKRHNLRQMMIDEFNKQSNAFKALLQEPDNAVLSKENHMO 
GKLiNEKEKMI SGQKLB IERLEKKNKTLEYKI EIIjE KTTTI YEED 
KRNLQQELETQNQKLQRQFSDKRRIiEARLQGMVTETTMKWEKEC 
ERRVAAKQLEMQNKLWVKDEKLKQLKAIVTEPKTEKPERPSRBR 
DRE KVTQRSVS PS PVPVSYL 


7005 


63 


876 


RNMALYQRWRCLRI^GL^CRLHTAWSTPPRWLAteRLGLFEEt 
WAAQVKRLASMAQKEPRTIKISI»PGGQKIDAVAWNTTPYQLiARQ 
I S S TIiADTA VAAQVNGEP YDLERP LETDSDLR FLTFDS PEGKAV 
FWHSSTHVLGAAAEQFIiGAVLCRGPSTE YGFYHDFFLG KERTI R 
GSELPVLERICQELTAAARPFRRLEASRDQLRQLFKDNPFKLHL 
IEEKVTGPTATVYGCGTLVDLCQGPHLRHTGQIGGLKLLSNSSS 
LWRSSG 


7006


22 


898 


NAFGRHST AVKMAAAAWLQ VLP VI LLLLGAkPS PL5 FFS AGP AT 
VAAADRSKWHIPIPSGKNYFSFGKILFRNTTIFLKFDGEPCDLS 
LNITWYLKSADCYNEIYNFKAEEVELYLEKLKEKRGLSGKYQTS 
S KLPQNCS ELFKTQTFSGDFMHRLPLLGE KQEAKENGTNLTFIG 
DKTAMHEPLQTWQDAPYIFIVHIGISSSKESSKENSLSNriFTMT 
VEVKGPYEYLTLEDYPliMIFFMVMCIVYVLFGVLWIiAWSACYWR 
DLLRIQFWIGAVI FLGMLEKAVFYAGFQ 


7007


2 


1001 


AMTVSGPGTPEPRPATPGASSVfeOLfekE6NELFKCGDYGGALAA 
YTQALGLDATPQDQAVLHRNRAACHLKLEDYDKAETEASKAIEK 
DGGDVKALYRRSQALEKLGRLDQAVLDLQRCVSLEPKNKVFQEA 
LRNIGGQIQEKVRYMSSTDAKVEQMFQILLDPEEKGTEKKQKAS 
QNLWLAREDAGAEKI FRSNGVQLLQRLLDMGE'TDLMLAALRTL 
VGI CSEHQSRTVATL5 ILGTRRWS ILGVESQAVSLAACHLLQV 
MFDALKEGVKKQFRGKEGAIIVGEWKQVWGLLDVTVMEGMGLSQ 
PGQFFGDQTCSCRLFGIRFGDI ILL 


7008


70 


1478 


CRSALGHERPPPAHLPAGGRRLQTCPRSCRWLGRPPSGLPPGPR
SPPPLAGPGQKMVQKKPAELQGFHRSFKGQNPFELAFSLDQPDH 
GDS D FGLQ CS ARP DM PASQP I D I PDAKKRG KKKKRGRATDS FSG 
RFEDVYQLQEDVLGEGAHARVQTCINLI TSQE YAVKI I EKQPGH 
I RS RVFRE VEMLYQCQGHRNVLELI E F FEE EDRFYLVFE KMRGG 
S I LSH IHKRRHFNELE AS WVQDVASALDFLHNKG IAHRDLKPE 
N I LCEH PNQVS P VK ICD FDLGSG I KLNGDCS P I STPELLTPCGS 
AEYMAPEWEAFSEEASIYDKRCDLWSLGVILyiLLSGYPPFVG 
RCGS DCGWDRGEACPACQNMLFES I QEGKYEFPDKDWAHI SCAA 
KDLISKLLVRDAKQRLSAAQVLQHPWVQGCAPENTLPTPMVLQR 
WD SHFLLP PH PCRI HVRPGGL VRTVTVNE 


7009


1 


626 


ARQLRNSWVDDFVAAPLI PLSQQI PTGNSLYESYYKQVDPAYTG 
RVGASEAALFLKKSGLSDIILGKIWDLADPEGKGFIiDKQGFYVA 
LRLVACAQSGHEVTLSNLNLSMP PP KFHDTSS PLMVTPPSAEAH 
WAVRVEEKAKFDG I FESLLPINGLLSGDKVKPVLMNS KLPLDVL 
GR VWDLSD I DKDGHLVRDE FAVAMHLVYRALE 


7010 


79 


571 


SHTRRAWPETLLS PLCPLLGGGTAMSGGEQKPER YYVG VDVGT
GSVRAALVDQSGVLLAFADQPIKNWEPQFNHHEQSSEDIWAACC 
WTKKWQGIDLNQIRGLGFDATCSLWLDKQFHPLPVNQEGDS 
HRNVIMWLDHRAVSQVNRINETKHSVLQYVGG 


7011 


3 


994 


R I QTL PNQNQ SQTQP LLXTP PAVLQPIAP^TTFbVQTQPQPQS L 
LQAQISAASITPLLQTQPQPLLQQPQQKAGLLQPPVRIVSQPQP 
ARRLD P PS R FSGRNDRGDQ VPNRKDDRS RERERERRRSRERS PQ 
RKRS RERS P RR ERERS PRRVRR VVPRYTVQ FS KFS LDC PSCDMM 
ELRRRYQNLYIPSDFFDAQFTWVDAFPLSRPFQLGNYCNFYVMH 






WO 01/53312 



PCT/US00/34263



SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








REVESLEKNMAILDPPDADHLYSAKVMLMASPSMEDLYHKSCAL 
AEDPQBLRDGFQHPARLVKFLVGMKGKDEAMAIGGHWSPSLDGP 
DPEKDP5VLIKT\AIRCCKALTG 


7012 
7013


1 


2661 


rragsvkrgkarlfgpterqseRplrpsaarrpe^sgkkaaaa 

AAAAAAAATGTEAGPGTAGGSENGSEVAAQPAGIjSGPAEVGPGA 
VGERTPRKKEPPRASPPGGLAEPPGSAGPQAGPTWPGSATPME 
TGIAETPEG\RRTSRRKRAKVBYREMDESLAI*7IiSEDEyySEEER 
NAKAEKEKKLPPPPPQAPPEEENESBPEEPSGVEGAAFQSRLPH 
DRMTSQEAACFPDIISGPQQTQKVFLFIRKRTLQLWLDNPKIQL 
TFEATLQQLEAPYNSDTVLVHRVHSYLERHGLINFGIYKRIKPL 
PTKKTGKVI I IGSGVSGLAAARQLQSFGWnVTLLEARDRVGGRV 
ATFRKGNYVADLGAM WTGLGGNP MAWS KQVNMELAKI KQKCP 
LYEANGQAVPKEKDEMVEQEFNRLLEATSyLSHQLDFNVLNNKP 
VSLGQALEWXQLQEKHVKDEQIEHWKKIVKTQEELKELLNKMV 
NTjKE K I KELHQQ YKEAS EVKP PRD I TAE FLVKS KHRDLTALCKE 
YDELAETQGKLEEKLQELEANPPSDVYLSSRDRQILDWHFANLE 
FANATPLSTLSLKHWDQDDDFBFTGSHLTVRNGYSCVPVALAEG 
LDIKLNTAVRQVRYTASGCEVIAVNTRSTSQTFIYKCDAVLCTL 
PLGVLKQQPPAVQFVPPLPEWKTSAVQRMGFGNLNKWLCFDRV 
F WDPS VNLFGHVGS TTASRGEL PLF WNLYKAP I LLALVAGEAAG 
IMENISDDVIVGRCLAILKGIPGSSAVPQPKETWSRWRADPWA 
RGSYSYVAAGSSGNDYDLMAQPITPGPSIPGAPQPIPRLFFAGE 
HTIRNYPATVHGALLSGLREAGRIADQFLGAMYTLPRQATPGVP 
AQQSPSM 




1 


2661 


RRAgsvkrgearlfgpterGSerplrpsaarrpemlsgkkaaaa 

AAAAAAAATGTEAGPGTAGGSENGSEVAAQPAGLSGPAEVGPGA 
VGERTPRKKEPPRASPPGGLAEPPGSAGPQAGPTWPGSATPME 
TGIAETPEG\RRTSRRKRAKVEYREMDE3liANLSEDEYYSEEER 
NAKAE KEKKLP PPP PQAP PEEENES E P EE PSG VEGAAFQS RL PH 
DRMTS QEAAC F PDI I SG PQQTQ KVFLF I RNRTLQL WLDN PKIQL 
TFEATLQQ LE AP YNSDT VLVHRVHS YLERHGL INFGIYKRIKPL 
PTKKTGKVI I IGSGVSGLAAARQLQS FGMDVTLLEARDRVGGRV 
ATFRKGNYVADLGAMVVTGLGGNPMAVVSKQVNMELAKIKQKCP 
LYEANGQAVPKEKDEMVEQEFNRLLEATSYLSHQLDFNVLNNKP 
VSLGQALEWIQLQEKHVKDEQIEHWKKIVKTQBEJjKELLNKMV 
NLKE K I KELHQQ YKEAS E VKPPRDI TAE FLVKS KHRDLTALCKE 
YDELAETQGKLEEKLQELEANPPSDVYLSSRDRQILDWHFANLE 
FANATP LSTLS LKHWDQDBDFEFTGSHLTVRNG YS CVP VALAEG 
LDI KLNTAVROVR YTAS GCE V I AVNTR S TSQTP I YKCDAVLCTL 
PLGVLKQQPPAVQFVPPLPEWKTSAVQRMGFGNLNKWLCFDRV 
FWDPSVNLFGHVGSTTASRGELFLFWNLYKAP I LLALVAGEAAG 
IMENI SDDVI VGRCLAI LKGI FGSSAVPQPKETWSRWRADPWA 
RGSYS YVAAGSSGND YDLMAQPITPGPS I PGAPQPI PRLFFAGE 

HTIP>NYPATVHGAIjLSGLREAGRIADQFLGAMYTLPRQATPGVP 
AQOSPSM 


7014 


3 


3950 


DFEVGDKIRILATLEDGWLEGSLKGRTGIFPYRFVKLCPDTRVE 
ETMALPQEGSLARIPETSLDCLENTLGVEEQRHETSDHEAEEPD 
CIISEAPTSPLGHLTSEYDTDRNSYQDEDTAGGPPRSPGVBWEM 
PLATDSPTSDPTEWNGISSQPQVPFHPNLQKSQYYSTVGGSHP 
HSEQYPDLLPLEARTRDYASLPPKRMYSQLKTLQKPVLPLYRGS 
SVSASRWKPRQSSPQLHNLASYTKKHHTSSVYSISERLEMKPG 
PQAQGLVMEAATHSQGDGSTDLDSKLTQQLIEFEKSLAGPGTEP 
DKILRHFS IMDFNSEKDI VRGSS KLITEQELPERRKALRPPPPR 
PCTPVSTSPKLLVDQNLKPAPPLWRPSRPAPLPPSAQQRTNAV 
S PKLLSRHRPTCE TLE KEGPGHMGRS LDQTS PCPLVLVRI EEM E 
RDLDMYSRAQEELNLMLEEKQDES SRAETLEDLKFCESNIESLN * 
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








MELQQLREMTLLSSQSSSLVAPSGSVSAENPEQRMLEKRAKVIE 
ELLQTERDYIRDLEMCIERIMVPMQQAQVPNIDFEGLFGNMQMV 
I KVS KQLLAALE I S DAVG P VFLGHR DELBGT YKI YCQNHDEAI A 
LLE I YEKDE KI QKHLQDS LADLKS LYNBWGCTNY INLGS FLI KP 
VQRVMRYPLLLMELLNSTPESHPDKVPLTNAVLAVKE1NVNINE 
YKRRKDLVLKYRKGDEDS LME KI S KLN IHS I IKKSNRVS SHLKH 
LTGFAPQIKDEVFEETEKNFRMQERLIKSFIRDLSLYLQHIRES 
ACVKWAAVSMWDVCMERGHRDLEQFERVHRYISDQLFTNFKER 

DKKTLEELQSARNNYEALNAQljLDELPKFHQYAQGLFTNCVHGY 
AEAHCDFVHQALEQLKPLLS LLKVAGREGNL I AI FHEEHSRVLQ 
QLQVFTFFPESLPATKKPFERKTIDRQSARKPLLGLPSYMLQSB 
E LRAS LLAR Y P PE KLFQ AERNFNAAQDLDVS LLEGDLVG V I KKK 
DPMGSQNRWLIDNGVTKGFVYSSFLKPYNPRRSHSDASVGSHSS 
TESEHGSSSPRFPRQNSGSTLTFNPN\S\MAVSFTSGSCQKQPQ 
DASPPPKEWDQGTLSASLNPSNSESSPSRCPSDPDSTSQPRSGD 
SAD VARDVKQPTATPRS YRNFRHP E I VGYS VPGRNGQSQDLVKG 
CARTAQAPEDRS TE PDG S EAEGNQ VY FAVYT FKARNPN ELS VSA 
NQKL KI IiE FKD VTGNTE WWLAE VNG KKGYVP SN Y I RKTE YT 


7015 


1842 


513 


GFQRRCVSCVAGSAFSGPRLASASRSNGQGSALDHFLGFSQPDS 
SVTPCVPAVSMNRDEQDVLLVHHPDMPENSRVLRWLLGAPNAG 
KSTLSNQLLGRKVFPVSRKVHTTRCQALGVITEKETQVILLDTP 
GI I S PGKQKRHHLE LSLLEDP WKSMES ADL VVVLVD VS D KWTRN 
QLSPQLLRCLTKYSQIPSVLVMNKVDdjKQKSVLIjEIjTAALTEG 
WNGKKLKMRQAFHSHPGTHCPSPAVKDPNTQSVGNPQRIGWPH 
FKE I FMLSALSQEDVKTLKQYLLTQAQPGPWE YHSAVLTSQTPE 
EI CANI IREKLLEHLPQEVPYNVQQKTAVWEEG PGGELVIQQKL 
LVP KES YVKLL IGPKGHVI SQI AQEAGHDLMDI FLCDVDIRLSV 
KLLK 


7016


167 


2513 


ILNAPKPPPPRDSVEAVAAKRDTGGGSWGTGMDVSGQETDWRST 
AFRQKLVSQIEDAMRKAGVAHSKSSKDMESHVFLKAKTRDEYtiS 
LVARLI I K FRD I HN KKS OAS VS D PMNALQS LTGG PAAGAAG IGM 
P PRGPGQSLGGMG S LG AMGQ PMS LSGQ P P PGTS G MAPHS MAWS 
TATPQTQLQLQQVAAAAAAATARSSSS SSRRRYSSSSSS SNSKQ 
FQAQQSAMQQ\QFQA\ WQQQQQL\QQQOQQQQHL I KLHHQNQQ 
QXQQQQQQLQRIAQLQLQQQQQQQQQQQQQQQQALQAQPPIQQP 
PMOOPO P P PSOAL POOLOOMHHTOHHO P PPOPDO P P VADNOP <? n 
LPPQSQTQPLVSOAQALPGQMLYTQPPLKFVRAPMVVQQPPVQP 
QVQQQQTAVQTAQAAQMVAPGVQVSQS S LPMLSS PS PGQQVQTP 
QSMPPPPQPSPQPGQPSSQPNSNVSSGPAPSPSSFLPSPSPQPF 
\QSPVTARTPQNFS VPS PGPLNTP VNPS S VMSPAGSSQAEEQQ Y 
LDKLKQLSKYIEPLRRMINKIDKNEDRKKDLSKMKSLLDILTDP 
SKRCPLKTLQKCEIALEKLKNDMAVPTPPPPPVPPTKQQYLCQP 
LLDAVLANIRSPVFNHSLYRTFVPAMTAIHGPPITAPWCTRKR 
RLEDDERQSIPSVLQGEVARLDPKFLVNLDPSHCSNNGTVHLIC 
KLDDKDLPSVPPLELSVPADYPAQSPLWIDRQWQYDANPFLQSV 
HRCWTSRLLQLPDKHS VTALLNTWAQS VHQACLSAA 


7017 


1 


1785 


INLGNTCYMNSVI*AI*FMATDFRRQVLSLNl*NGCNSLMKKL<iHl» 
FAFLAHTQREAYAPRIFFEASRPPWFTPRSQQDCSEYLRFLXiDR 
LHEEEK I LKVQASH KPSE I LE CS ETSLQE VAS KAAVLTETPRTS 
DGEKTLIEKMFGGKI^TOIRCLNCRSTSQKAEAFTDLSLAFWPS 
YSLEYMS CPDCS QS PS I QDGGLMQASVPG PS E E PWYNPTTAAF 
ICDSLVNEKTIGSPPNEFYCSENTSVPNESNKILVNKDVPQKPG 
GETTPSVTDLLNYFLAPEILTGDNQYYCENCASIiQNAEKTMQIT 
EE PEYL I LTLLR FS YDQ K YHVRRKI LDNVSL PL VLEL P VKRITS 
Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








FSSLSESV7SVDVDFTDLSENLAKKLKPSGTDEASCTKLVPYLLS 
S WVHSG I S SESGH Y YS YARN I TSTDSS YQM YHQS E ALALASSQ 
SHLLGRDS PSAVFEQDLENKEMS KEWPLFNDSRVTPTS PQS VQK 
I TSR FP KDTA Y VLLYJCKQHSTNGLSGNNPTSGLW I NGDP PLQKE 
LMDAI T KDNKI* YLQEQELN ARARALQAASASCS FRPNG FDDND P 
PGSCGPTGGGGGGGFNTVGRLVF 


7018 


484 


1066 


SLVFRGNTWSGEAGHHCSALFNLAAYHQLFVGTER1RAPEIIFQ 
PSLIGEEQAGIAETLQYILDRYPKDVQEMLVQNVFLTGGNTMYP 
GMKARME KELLEMR P FRSS FQVQLASNP VLDAWYGAR DWALNHL 

DDNEVWITRKEYEEKGGEY^KEHCASNIYVPIRLPKQASRSSDA 
QASS KGSAAGGGGAGEQA 


7019 


1048 


335 


APGGFLVTMVFPAPS P PWMLGCCSHEVTAGPPTLCKDMSALVAA 
RMRH I P LAPGS DWRD L PNIEVRLS DGTMARKLRYTHHDRKNGRS 
SSGALRGVCSCVEAGKACDPAARQFNTLIPWCLPHTGNRHNHWA 
GliYGRLEWDGFFSTTVTNPEPMGKQGRVliHPEQHRWSVRECAR 
SQGFPDTYRLFGNILDKHRQVGNAVPPPLAKAIGLEIKLCMLAK 
ARESAS AKI KEEEAAKD 


7020 


1


2154 


FADS KRKS VLLDKI KNLQVALTS KQQS LETAMSFVARNTFKRVR 
NGFLMRKVAVFFSNTPTRASPQLREAVLKLSDAGITPLFLTRQE 
DRQL I NALQ INNTAVGHALVLP AGRDLTDFLENVLTCH VCLD I C 
NIDPSCGFGSWRPSFRDRRAAGSDVDIDMAFILDSAETTTLFQF 
MKKYIAYLVRQLDMS PDPKASQH FARVA WQHAPS ES VDNAS 
MPPVKVEFSLTDYGSKRKLVDFLSRGMTQLQGTRALGSAIEYTI 
ENVFESAPNPRDLKIWLMLTGEVPEQQLEEAORVILOAKCKGY 
FFWLGIGRKWIKEVYTFASEPNDVFFKL7DKSTELNEEPLMR 
FGRLLPSFVSSENAFYLSPDIRKQCDWFQGDQPTKNIiVKFGHKQ 
VNVPNNVTSS PTSNPVTTTKP VTTTKPVTTTTKP VTTTTKPVTI 
INQ PS VKPAAAKPAPAKPVAAKP VATKTATVRPP VA VKP ATAAK 
P VAAKPAAVR P PAAAAAKP VATKP EVPRPQAAKPAATKP ATTKP 
MVKMSREVQVFEITENSAKLHWERPEPPGPYFYDLTVTSAHDQS 
LVLKQNLTVTDRVIGGLLAGQTYHVAWCYLRSQVRATYHGSFS 
TKKSQPPPPQPARSASSSTINLMVSTEPLALTETDICKLPKDEG 

TCRDFILKWYYDPNTKSCARFWYGGCGGNENKFGSQKECEKVCA 
PVLAKPGVISVMGT 


7021
7022


2 


336 


VNAVSFFPNGYAFATGSDDATCRIiFDLRADQELLLYSHDNI ICG 
ITSVAFSKSGRLLLAGYDDFNO^VWDTLKGDRAGVLAGHDNRVS 
CLGVTDDGMAVATGS WDS FLRIWN 




2


856


vyigsfwshpllipdwrklfeaeeqdlfrdIqslprnaaLrkln 

DLIKRARLAKVHAYIISSLKKEMPSVFGKDNKKKELVNNLAEIY 

grierehqispgdfpnlkrmqdqlqaqdfskfoplkskllewd 
dmlahdiaqlmvlvrqeesqrpiqmvkggafegtlhgpfghgyg 

EGAGEGIDZ3ABWVVARDKPMYDEIFYTIiSPVDGKITGANAKKEM 
VRSKL PNS VLGKI W KLAD I DKDGM LDDDE FAIiANHL 1 KVKLEGH 
ELPNELPAHLLPPSKRKVAE 


7023 
7024


2 


748 


AMVFGGWPYVPQYRDIRRTQNADGFSTYVCLVLLVANIIiRILF 
WFGRRFESPLLWQSAIMILTMLLMLKLCTEVRVANELNARRRSF 
TAADSKDBEVKVAPRRS FLDFDPHH FWQWSS FSDYVQCVLAFTG 
VAGYI TrLSIDSALFVETI^FIiAVLTEAMIjGVPQL YRNHRHQST 
EGMS I KMVLMWTSGDAFKTAYFLLKGAPLQFS VCGLLQVLVDIA 
ILGQAYAFARHPQKPAPHAVHPTGTKAL 




1207 


190 


RTGVTGWAQVWMFGGGGVLsSGEQiiQMPVKPERGLGPSDGMLV 
SSRRGSPGTVLGLPFWLLTPVLVSRSIRSMLLLTRSPTAWHRLS 
QLKPPVLPGTLGGQALHLRSWLLSRQGPAETGGQGQPQGPGLRT 
RLLITGLFGAGLGGAWLALRAEKERLQQQKRTEALRQAAVGQGD 
FHLLDHRGRARCKADFRGQWVLMYFGFTHCPDICPDELEKLVQV 
VRQLEAEPGLPPVQPVFITVDPERDDVEAMARYVQDFHPRLLGL 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








TGSTKQVAQASHSYRVYYNAGPKDEDQDyiVDHSIAlYLLNPDG 
LFTD Y YGRS RS AEQ I S DS VRRHMAAFRS VLS 


7025 


232 


832 


ERNSPIGNNENL*K\HSLDCLCFRGDWEGNTQPQTLQDNQB'ECF 
KQVIRTCEKRPTFNQHTVFNLHQRLNTGDKLNEFKELGKAFISG 
SDHTQHQLIHTSEKFCGDKECGNTFLPDSEVIQYQTVHTVKKTY 
ECKECGKSFSLRSSLTGHKRIHTGEKPFKCKDCGKAFRFHSQLS 
VHKRIHTGEKSYECKECGKAFSCG 


7026 


328 


1146 


NPNPS IGD1 KDI KKAAKSMLDPAHKSHFHPVTPSLVFLCFI FDG 
LHQALLSVGySKRSNTWGNENEERGTPYASRFKDMPNFIALEK 
SSVLRHCCDLLIGVAAGSSDKICTSSLQVQRRFKAMMASIGRLS 
HGESADLLISCNAESAIGWISSRPWVGELMFTFLFGDFESPLHK 
LRKSS*LPRKHR*QPINAVRMFLDQCMDGSIALRAIVSEIPVFE 
EKKlWG*KGIGEIF*WGCTLPPHYWGAVTTNVPKLSNSGKIiLG 
QJDEQPHIFG


7027


43 


954 


GRRLQQQQR PEDAEOGAEGGGKRGEAGWEGG YPE I VKEN KL FEri ' 
Y YQELKI VP EGEWGQ FMDALRE PL PATLR ITG YKSHAKE I LHCL 
KNKYFKELEDLKMIXSQKVEVPQPLSWYPEELAWHTNLSRXILKK 
S PH LEKFHQFLVS ETE SGN I SRQE AVS MI P PLLLNVRPHHKI LD 
MCAAPGSKTTQLIEMLHADMNVPFPEGFVIANDVDNKRCYIjLVH 
QAKRLSSPCIMWNHDASSIPRLQIDVDGRKEILFYDRILCDVP 

csgdgtmrknidvwkkwttlnslqlhglqlriatrgaeql 


7028


189 


608 


srpppepepgtmvekgsdsssekggvpgtpstqslgsrnkirns^ 

kkmqswysmlsptykqrnedfrklfsklpeaerlivdyscalqr 

eillqgrlylsenwicfysnifrwettisiqlkevtclkkexta 

KLIPNAIQ 


7029 


1343 


40 


-VLESNTEAKQATGTSSKLRHGTGQEKGREGPRCPSGIAQLRLWG 
/ PCPHAGRETGPRAS API PGS * GHGWHW *RKDQRQERS EG PSAL 
SPHSPSLLNMQQAPTHVGPGMGSQRPRSSWPEQVGVGSQLSRE 
RWRA * RSLPGAAAS ERTEMTKERS P /R PCQG YD S SNWFTQ PGKK 
TRKRNSRRNTMVSRGGGCLLYPLQSIMPE*QLR*GAHASPPTQG 
R*GKGGPRSPLTKASGTTHIPTPFFGSIP/RPTRDSGPGTDNS\ 
AAPGQKRGHREA*QGPEPV/WGRVTTHLQGPAG*TKPLGS\RNW 
VPGPAEGEQGEGAGLBGRP * PLKGCRSTLTFS PQLS 1 PMVGKKP 
PEGTTASFFP\RSCHSE*RKPPPSCPHAPALSLPHPLPLPLPPL 

PLPLPGAGT*HSARSGRPGQSETGSLCHNCHHCPPHCPKCSPGG 
T 


7030


2 


521 


FVCFSAPGSGQGGKRRVKMELSAVGERVFAAEAUjKRRIRKGRM 
EYLVKWKGWSQKYSTWEPEENILDARLIJ^EEREREMELYGPK 
KRGPKPKTFLLKAQAKAKAKTYEFRSDSARGIRI pypgrspqdl 
ASTSRAREGLRN \ RVCPRQRAAPAPAAP \ PRRGPSGPGPRPQ * G 
PGLHFPGPGGPSKHGFVPASEQHQHQQHLPRRGPSGPGPRPG 


7031
7032


960 


59 


HCSVPGAEWPRKPPAQICPQLTSRPHLSSPRSLSPGCGHSPGPG 
/CKPS/RHCDELHEGPSRTAALPCGKPQPKHGVEECG/PCPCLA 
PRRLTEPPALTVSPVGRAAPSGAL*PSGRACSACSHRLAPEAAL 
SAAAPRPSLGSGQNASGLPAASLPPQDSSQPHKTVPSPARSVPP 

SGSTAS HSRRGC* S PR * TPAP PRRDHGRSAAFE VLTAAASAQP C 
ASQGGPRPTGAGRTPSPLGLPFSRGPPAASARPFCRHPSL 


7033 


1393 
*89 


2104 
815 


RRPGRTEPVEPpPVPPpPRASNSKSRCR*RNLHLAPI**QSPljRK 
SRQIGTSSLPFGRSAGERPRPAATFCLSRGGSSPVFL*PSSSSL 
BPWMKRQFGRLHSLFWKSWQKMNSPLLTPKLDTSLMSGWRYRQR 
LPRLHTFLKKSLQMASELAPPLPTPAPLASSLPPPPGPPPLLPV 
PLA*IiSRSGILVPPNSGFSLSC\PLGDH*GSSGEVRGSCGSPPP 
HHCWVLPPPP*LLLPPR 

RSRDCLSSSATSNRARRSKCSGPKRATPLDSGPGP*APPGPSSA 
SEQ
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








LMMPSSCPWRTGALGPSPAGSRALGRCT^SVGPGSRWLTRTSSP 
GCATRTWRTMRMEPRPLRSRMGESAPGIPAELPSAAPSGPSAPS 
AAAPSAPTTPAAAGPNTL*SRRTAEWCWPPSCSCCWGWC*SWSA 
WDWRRPPLQVSPAPSSSCRASCCWCLESIT*SSSTARSRATGAS 
SSSTCPTSRSDRGAAWTP\SPMGAPLLPCSVPLISREEALQDPR 
NPSP*GVCSGSSGHAGLALGKPPVACSVP 


7034 


92 


1942 


EDTSSMPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRERV 
KAMFYHAYDSYLENAFPFDELRPLTODGHDTWGSFSLTLIDALD 
TLL\TLFYFQI LGNVSE FQRWE VLQDSVDFD I DVNASVFETNI 
R WGGLLS AHLLS KKAG VE VEAGWPCSG P LLRMAE EAAR KLL PA 
FOTPTGMPYGTVNLLHGVNPnPTPVTPTZlrtTn'M?Ti/T7Pzi.T'T q or 
TGD P VFED VARVALMRLW ES RS D I GLVGNHI D VLTGKW VAQDAG 
IGAGVDSYFEYLVKGAILLQDKKLMAMFLEYNKAIRNYTRFDDW 
YLWVQMYKGTVSMPVFQSLEAYWPGLQSLIGDIDNAMRTFLNYY 
TVWKQ FGGLPE FYNI PQG YTVE KREG YP LRP EL I ES AM YL YRAT 
GDPTLLELGRDAVES IEKI SKVECGFATI KDLRDHKLDNRMES F 
FLAETVKYLYLLFDPTNFIHNNGSTFDAVITPYGECILGAGGYI 
FNTEAHPI D PAALHCCQRLKEEQ WEVEDLMRE F YSLKRSRS KFQ 
KNTVSSGPWEPPARPGTLPSPENHDQARERKPAKQKVPLLSCPS 
QP FTS KLALLGQ VFLDS S * PLDNF FI F I FLRLN YNKLLLAI I KK 
K 


7035 


92 


1942 


EDTSSMPFRLL I PLGLLCALLPQHHGAPGPDGSAPDPAHYRERV 
KAMFYHAYDS YLENAFP FDELRPLTCDGHDTWGSFS LTLI DALD 
TLL\ TLFYFQ I LGNVSEFQRWEVLQDS VDFDIDVNASVFETNI 
RWGGLLSAHLLSKKAGVEVEAGWPCSGPLLRMAEEAARKLLPA 

TGDPVFEDVARVALMRL WE S RSDI GLVGNHI DVLTGKWVAQDAG 
I GAG VDS YFE YLVKGAI LLQDKKLMAM FLEYNKAIRNYTRFDDW. 
YLWVQMYKGTVSMPVFOSLEAYWPf5I.fi s T.TRHT r»MHMO t pt.mvv 
TVWKQFGGLPEFYNI PQG YTVE KREGYPLRPELIESAMYLYRAT 
GDPTLLELGRDAVESIEKISKVECGFATIKDLRDHKLDNRMESF 
FLABTVKYL YLLFDPTNF 1HNNGS TFDAVI TPYGECILGAGG YI 
FNTEAH P I D PAALHCCQRLKEEQWE VEDLMREFYS LKRSRS KFQ 
KNTVSSGPWEPPARPGTLFSPENHDQARERKPAKQKVPLLSCPS 
QPFTSKLALLGQVFLDSS * PLDNFF1 F I FLRLNYNKLLLAI I KK 
K 


7036


442 


761 


CLAPLFSCFQI2NLHLAPSGRLRWAWLRGPGRN*LPGEGPSiPT 
RNW* BRKAGCSQP C/ PAQQHHGRPPGVS PLPRDPHPTTLR PLPP 
PPPPPPPPPRRPPRNRRPG 


7037


442


761


CLAPLFS CFQI INLHLAPSGRLRWAWLRGPGRN*LPGEGPS I PT 
RNW* ERKAGCSQPC/ PAQQHHGRP PGVS PLPRDPHPTTLR PLPP 
PPPPPPPPPRRPPRNRRPG 


7038 


155 


891 


GAGAASDMSSGLRAADFPRWKRHISEQLRRRDRLQRQAFEEIIL 
QYNKLLEKSDLHSVLAQKLQAEKHDVPNRHEIS PGHDGTWNDNQ 
LQEMAQLRI KHQEELTELHKKRGELAQ \RVI DLNNQMQRKDREM 
QMNEAKIAECLQTISDLETECLDLRTKLCDLERANQTLKDEYDA 
LQITFTALEGKLRKTTEENQELVTRWMAEKAQEANRLNARE*KR 
LQEAAS PAAERACRS S KGTSTSRTG 


7039 


155 


891 


GAGAASDMSSGLRAADFPRWKRHISEQLRRRDRLQRQAFEEIIL 
QYNKLLEKSDLHSVLAQKLQAEKHDVPNRHE IS PGHDGTWNDNQ 
LQEMAQLR I KHQEELTELHKKRGELAQ \R VIDLNNQMQRKDREM 
QMNEAKIAECLQTISDLETECLDLRTKLCDLERANQTLKDEYDA 
LQITFTALEGKLRKTTEENQELVTRWMABKAQEANRLNARE* KR 
LQEAAS PAAERACRS S KGTSTSRTG 


7040 


34 


789 


KITPPRRPHRCSSGHGSDNSSVLSGELPPAMGKTALFYHSGGSS 
GYESVMRDS EATGSAS SAQDSTS ENSSS VGGRCRSLKTPKKRSN 
SEQ
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








PGSQRRRUIPALSLDTSSPVRKPPNSTGVRWVDGPLRSSPRGLG 
E P FE I KVYE I DD VE RLQRR RGGAS KEAM CFNAKLKI L EHRQQR I 
AEVRAKYEWLMKELEATKQYLMLDPNKWLSEPDLEQVWELDSLE 
YLEALECVTERL ES R VNFCKAHLMM ITCFD IT 


7041 


1


567 


SGRVAMGRRRAPAGGSLGRALMRHQTQRSRSHRHTDSWLHTSBL 
NDGYDWGRbNLQSVTEQSSLDDFLATAEIjAGTEFVAEKLNIKFV 
PAEARTGLLS FEESQRI KKLHEENKQFLC I PRRPNWNQNTTPE E 
LKQAE KDNFLEWRRQL\ VRLEE EQKL I LT P FE RKLD FWRQLWR V 
IERSDIWQIVDA 


7042 


7 


,„ 


PIHMAAAALRADI\ISPLFPHIQGYLLLSASHG\ATSLHTKGAX 
PLETVTMYTVIPKSKYVLVKPDTQYPYSENLDEFKRLAENSASN 
DDLLMAEVAISDYGDKLTLELREKy


7043


2 


2170 


ARGMAARDSDSE EDL VS YGTG LE P LEEG ERPKK P I P LQDQTVRD 
EKGRYKRFHGAFSGGFSAGYFNTVGSKEGWTPSTFVSSRQNRAD 
KSVLGPED FMDEEDIiS EFG I APKAI VTTDDFAS KTXDRI REKAR 
QLAAATAP I PGATLI>DDLITPAKLS VGFELLRKMGWKEGQGVGP 
RVKRRPRRQKPDPGVKIYGCALPPGSSEGSEGEDDDYLPDNVTF 
APKDVTPVDFTPKDNVHGLAYKGLDPHQALFGTSGBHFNLFSGG 
SERAGDLGEIGLNKGRKLGISGQAFGVGALEEEDDD1YATETLS 
KYDTVLKDEEPGDGLYGWTAPRQYKNQKESEKDLRYVGKILDGF 
S LAS KPLSSKKIYPP PE LPRD YRP VHYFRPM VAATSENSHLLQ V 
LSESAGKATPDPGTHSKHQLNASKRAELLGETPIQGSATSVLEF 
LSOKDKERIKEMKQATDLKAAQLKARSLA0WAQSSRAQPSPAAA 
AGHCS WNMALGGGTATLKASNF KP FAKDPE KQKR YDE FLVHMKQ 
GQKDALERCLDPSMTEWERGRERDEFARAALLYASSHSTLSSRF 
THAKEEDDSDOVEVPRDQENDVGDKQSAVKMKMFGKLTRDTFEW 
HPDKLLFQ/RLVGLPRVKRDKYSVFNFLTLPETASLPTTQASSE 
KVSQHRGPDKSRKPSRWDTSKHEKKEDSISEFLRLARSKAEPPK 
QQSSPLVNKEEEHAPELSAN 


7044 


276 


734 


EVYLTDEFAKGRKVADLYELVQYAGNIIPRLVLLiTVGWVVKS 
FPQSRKDILKDLVEMCRGVQHPLRGLFLRNYLLQCTRNILPDEG 
EPTDEETTGDISDSMDFVLLNFAEMNKLWVRMQHQGHSRDREKR 
ERERQELRILVGTNLVRLSQV


7045


3 


513 


LGFKMEALSRAGQEMSLAAIjKQHDPYITSIADLTGQVALYTFCP 
KANQWEKTDI EGTLFVYRRSAS PYHGFTIVNRLNMHNLVEPVNK 
DLEFQLHEPFLIjYRNASLSIYSIWFYDKN0CHRIAKLMADWEE 
ETRR5QQA/RSGQTESQPGQWLQR PQAHRHPGDAEQSQG 


7046 


3 


513 


LGFKMEALSRAGQEMSLAALKQHDPYITSIADLTGQVALYTFCP 
KANQWEKTDI EGTLFVYRRSAS PYHGFTIVNRLNMHNLVEPVNK 
DLEFQLHEPFLLYRNASLSIYSIWFYDKNDCHR1AKLMADWEE 
ETRRSQQA/RSGQTESQPGQWLQRPQAHRHPGDAEQSQG 


7047 
7048


103 


486 


QMKIEKCXSWSEGLTSIKGNCHNFYTAISKDVTYKELKNLIiNSKN 
IMLIDVREIWEILEYQKIPESINVPLDEVGEALQMNPRDFKEKY 
NEVKPSKSDS / 1 VFS YLAGVRSKKALDTAISLGFHS YYER 




92 


£27 


FFCLTLLSSWDYRHHATRRVISSPVFTMEDSGKTFSSEEEEANY 
" * j. * rwvitfidi* x vxi£,ijKJi c UbiioKb xcIAbLETQLQQIETRN 
RDLLSENNRLRMELETIKEKFEVQHSEGYRQISALEDDLAQTKA 

IKDQLOKYIRELEQANDDLERAKRATDHGLSKTFE\QRLN\QAI 
EKKW 


7049 
7050 


393 
393 


938 
938 


KRTGSASYGGPPPGLGGPATXASVAGRCSSVGKI PARRCYEDEL

VPVFEAVGRIYELRLMMDFDGKNRGYAFVMYCHKHEAKRAVREL 

NNYElRPGlUJXSVCCSVDNCRIiFIGGIPKMKKREEILEEIAKVT 

EGVLDVIVYASAADKMKNRGLRLRGVREPPRGCHWLGRKLIAWX 
ASSLWG 

KRTGSASYGGPPPGLGGPATXASVAGRCSSVGKIPARRCYEDEL 
SEQ
ID
NO: 


Predicted
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








VP VFEAVGR I YELRLMMDFDGKNRGYAF VM YCHKHEAKRAVREL 

NNYEIRPGRLLGVCCSVDNCRLFIGGIPKMKKREEIIjEEIAKVT 

EGVLDVIVYASAADKMKNRGLRLRGVRBPPRGCHWLGRKLIAWX 
ASSLWG 


7051 


119 




816


KKMNLAEICDNAKKGREYALLGNYDSSMVYYQGVMQOIQRHCQS 
VRDPAIKGKWQOVRQELLEEYEQVXSIVGTLESFKIDKPPDPPV 
SCQDEPPRDPAVWPPPVPAEHRAPPQIRR/RQSRSKTSEERNGR 
SRS PGTCRPST\ P ISKS EKP STS R DKD YRARGRDDKGR KNMQDG 

ASDGSMPKFDGAGYDKDLVEALERDIVSRNPSIHWDDIADLEEA 
KKLLREAGVIiPMWM 


7052 
7053 


467 


715 


SCPGRGKMSKLI^PEEMTSRDYYFDSYAHFGIHEEMLKDEVRTL 
TYRNSMYHNKHVFKDKWLDVGSGTGILSMFAARQGPRR 




467 


715 


SCPGRGKMSKbliNPEEMTSRDYYFDSYAHFGlHEEMLKDEVRTL 
TYRNS^HNKHVFKDKVVLDVGSGTGILSMFAARQGPRR 


7054 


1 


1036 


GTSQRSREU'DARRRSAGAEPTARLPWPAALEEWPSCPCEPLGPG 
RRCRWDAMEYDEKLARFRQAHLMPFNKQSGPRQHEQGPGEEVPD 
VTPEEALPELPPGEPEFRCPERVMDLGLSEDHFSRPVGLFLASD 
VQQLRQAIEECKQVILELPEQSEKQKDAWRLIHLRLKLQELKD 
PNEDEPNIRVLLEHRFYKEKSKSVKQTCDKCNTIIWGLIQTWYT 
CTG C YYRCKS KCLNL I S KPCVSS KVS HQAEYE LN I CP ETGLDSQ 
DYRCAECRAPI/CS/DGVVPSEARQCDYTGOYYCSHCHWWDLAV 
I PARWHNWDFEPRKVSRCSMRYLALMVSRPVLRLREI N 


7055 
7056


2


527 


DS RR VS WRS WLANE / WG KHLCLF I WLS MNVLL F W KTFLL YNQGP 

EYHYLHQMLG/ ALCLSRAS ASVLNLNCSL ILLPMCRTLLAYLRG 

S QKVPSRRTRRLLDKSRT FH ITCGATI CI FSGVHVAAHLVNALN 

FSVNYSEDFVELNAARYRDEDPRKLLFTTVPGLTGVCMEWLFL 
M 




2 


527 


DSEKVS WRS WLANE / WG KHLCLF I WLS MNVLL Jb'WkTFLLVNQG P 
EYHYLHQMLG/ALCLSRASASVLNLNCSLILLPMCRTLLAYLRG 
SQKVPSRRTRRLLDKSRTFHITCGATI CI FSG VHVAAHL VNALN 
FSVNYSEDFVELNAARYRDEDPRKLLFTTVPGLTGVCMEVVLFL 


7057 
7058 


1368 


431 

r 


GIYLHVNEKIPRPTCIGDRQENDKENLNLENHRDQELLHASCQA 
SGEVPSQASLRGFFTEDEPGCFGEGENLPEALQNIQDEGTGEQL 
S PQER I S E KQLGQHL PNP HS G EM STM WLEE KRETS Q KG QP RAP M 
AQKLPTCRECGKTFYRNSQLIFHQRTHTGETYFQCTICKKAFLR 
SSDFVKHQRTHTGEKPCKCDYCGKGFSDFSGLRHHEKIHTGEKP 
YKCPICEKSFIQRSNFNRHQRVKTGEKPYKCSHCGKSFSMSSSL 
DKHQRSH LGKKPFQ * PVTKLS FP I S I SQPSHKNTQLHQEELCLR 
GYPC 






1 


469 


t'SGFGAVPDAJjGCRMSDLRITEAFLYMDYLCFRALCCKGPPPAR 
PEYDLVC I GLTGSGKTS LLS KLCS ES PDNWSTTGFS I KAVPFQ 
NAI LNVKELGG ADN I RKYWSRYYQGSQG VI FVLDS AS S EDDL EA 
ARN * S CTQLLQH PQLCTL PFL I LA 




7059 

7060


1 


1178 


WPAFPRQPAAAAMDALLGTGPRRARQCLGAAGPTSSGRAARTPA
APWARPSAWLECVCWTFDLELGOALEljVYPMnP'PT Tntrtrvcc t 

CYLSFPDSHSGCLGDTOFSFRMRQCGGQRSPWHADDRHYNSRAP 
VALQREPAHYFGYVYFRQVKDSSVKRGYFQKSLVLVSRLPFVRL 
FQALLSLIAPEYFDKLAPCLBAVCSEIDQWPAPAPGQTLNLPVM 
GVWQVRI PSRVDKSESSPPKQFDQENLLPAPWLAS VHELDLF 
RCFRPVLTHMQTLWELMLLGEPLLVLAPSPDVSSEMVLALTSCL 
QPLRFCCD PR P YFT IHDSEFKEFTTRTQ AP PNWLG VTNPFF I K 
TLQHWPHILRVGEPKMSGDLPKQVKLKKPFKV*RPWDTKP 




90 


1670 


&VWLPPSLWPWEEAMDSTKSEPLKGSPEAEDGNIEYKKLVNPSQ 
YRFEHLVTQMKWRLQEGRGEAVYQ IGVEDNGLLVGLAEEEMRAS 








IiKTLHRMAEKVGADITVLREREVDYDSDMPRKITEVLVRKVPDN" 
QQFLDLRVAVLGN VD SG JCSTLLG VLTQGE LDNGRGRARLNLPRK 
LHE I QSGRTS S ISFE ILGFNSKGEVHG INGTQWGQTLRMGW* + * 
RT* DGGRVWRLFEI V* MNALRGL*TSS AP LRKSMGNQLN* IKNG 
VKIKRQGHPGNGLGPGKSEGVGRAGRRH*GPWALGOWNYSDSR 
TAEEICESSSKMITFIDLAGHHKYLHTTIFGLTSYCPDCALLLV 
S ANTG I AGTTREHLGLALALKVPFFI VVS KI DLCAKTTVERTVR 
QL ERVLKQ PG CHKVPMLVTS EDDAVTAAQQFAQS PNVTPI FTLS 
SVSGESLDLLKVFLNILPPLTNSKEQEELMQQLTEFQVDEIYTV 
PEVGTWGGTLSR* IDLLATLPTQPSPIYSKTSWPKGGDPGI 


7061 


364 


710 


ARMPS PLGP PCLPVMDPETTLE E PETARLRFRGFC YQE VAGPRE 
ALARLRE LC CO^^OPEAHSKEOMLEMtiVIjRn PT .T> P p t natav 
RGQR PGS PE EAAAL VEGLQHDP * ARMPS PLG P PCLP VMDPE TTL 
EEPETAPXRFRGFCYQEVAGPREALARLRELCCQWLQPEAHSKE 
QMLEMLVLEOFLGTLPPEIQAWVRGQRPGSPEEAAALVEGI>QHD 
PGQLLG 


7062 


71 


744 


AKAGTNLERLHWLS YF PC I PKH KLKS SOKDICVROPM APTn ahpp 
TAIYCLTQNEWRLDEATDSFFQNPDSLHRESMRNAVDKKKLERL 
YGRYKDPQDENKIGVDGIQQPCDDLSLDPASISVLVIAWKFRAA 
TQCEFSRKEFLDGMTELGCDSMEKLKALLPRLEQEIiKDTAKFKD 
FYQFTFTPAKNPGQKGLDL*MAGAYWKLVLSGRFKFLYLWNTFI» 
MEHH 


7063 


2 


562 


LRTVPDLPGRRFRAMRTGQRR*PELPPDMNSLEOAEDtKAFERR 
LTEY IHCLQPATGRWRMLLIWS VCTATGAWNWLIDPETQKVS F 
FTSLWNHPFFTISCITLIGLPFAGIHKRWAPS I IAARCRTVLA 
EYNMSCDDTGKLI LKPRPHVQ* QSS LI VMGLKIAFLR1 SDTAKS 
HKGFLLRLDM 


7064


300 


884 


RDTGSDPSSTRRLCSTCCTGH*PAEPIASPHPSRGTCPPASSAS 
SRRTGCWTCPPESGHAQARRSRRASASRWGARGAVRSAVAARGC 
SSRAGRWLETPGRRRGPPACAAAAGRLRGPAP*AAPPTASVPAR 
CRCPAARTGAPAAATWLRRRLSGLRAPALGRRRS PGPS PKSAAP 
PLLTPLGAGRAGGSRANS 




1 


555 


ATTTHSARRSGRGAAAEAAASAAGGRQKGPDRKAWEGRRTTPGG 
RSQSEPKAPPPQKRSEAAFASMAHSPVAVQVPGMQNNIADPEEL 
FTKLER IGKGS FGE VFKG I DNRTQQ WAI KI I DLEE AKDE I EDI 
0QEITVLS0CDSSYVTKYYGSYLKGSKLWII^5EYLGGGSALDLL 
RAGPFDEFQ 


7066 


356 


676 


PGPQRGPWRAREGGHPLDPADHPRAPASIiRSNVRAATMMQICDT 
YNQKHS LFMAMNR F IGAVNNMDQTVMV PSLLRDVPLAD P GLDND 
VGVEVGGSGGCLEERTPP 


7067 


152 


973 


KENITMATEIGSPPRFFHMPRFQHQAPRQLFYKRPDFAQQQAMQ 
QLTFDGKRMRKAVNRKTIDYNPSVIKYLENRIWQRDQRDMRAIQ 
PDAGYYNDLVPPIGMLNNPMNAVTTKFVRTSTNKVKCPVFWRW 
TPEGRRLVTGASSGEFTLWNGLTFWFETILOAHDSPVRAMTWSH 
NDMWMLTADHGG YV KYWQSNMNNVKM FQAHKEAI REAR F IHN I P 
FS WP I VMVKLFS KC I LGAEMHGLCQFLGNFLHP I NT! FFFVFT 
HSPFCWAPF 


7068 


222 


816 


DTMKEWI^LFLALCSAKPFFSPSHIALKNMMLKDMEDTDDDDD 
DDDDDDDDDDEDNSLFPTREPRSHFFPFDLFPMCPFGCQCYSRV 
VHCSDLGLTS VP TN I PFDTRM LDLQNN K I KE I KENDFKGLTS L Y 
GL I LNNN KLTKIH P KAFLTT KKLRR L YLS HNQ LS E I PLNLP KS L 
AELRIHENKVKKIQKDTFKXK 


7069


1147 


1765 


FRDHRRYFYVNEQSGESQWEFPDGEEEEEESQAQENRDETLAKQ 
TLKDKTGTDSNSTESSETSTGSLCKES FSGQVSS SSLMPLT P FW 
TLLQSNVPVLQPPIiPLEMPPPPPPPPESPPPPPPPPPAPKMPPP 








E KTKKGRKDKAKKS KTKM PS LVKKWQS IQRE LDEEONS S S SEED 
RVS TAQKR I EEWKQQQL VSGMAERNANFE A 


7070 


1 


547


DGTMEDSEAVQRATAL I EQRLAQEEENE KLRGDARQKL PMDLLV 

LEDEKHHGAOSAALQKVKGQERVRKTSriDLRREIXDVGGIQNLI 

ELRKKRKQKKRDALAASHEPPPEPEEITGPVDEETPLKAAVEGK 

MKVIEKPLADGGSADTCDQFRJRTALHRASLEGHMEILBKLLDNG 
ATVDFQ 


7071


2 


921


ARGTLRALETAKKVGKVGANGQKAAGPSADSVTENKIGSPPKTP
VSNVAATSAGPSNVGTELNSVPQKSSPFLTRVPAYPPHSENIQY
FQDPRTQIPFEVPQYPQTGYYPPPPTVPAGVAPCVPRFVRSNNV
PESSLPPASMPYADHYSTFSPRDRMNSSPYQPPPPQPYGPVPPV
PSGMYAPVYDSRRIWRPPMYQRDDIIRSNSLPPMDVMHSSVYQT
SLRERYNSLDGYYSVACQPPSEPRTTVPLPREPCQHLKTSCEEQ
IRRKPDQWAQYHTQKAPLVSSTLPVATQSPTPPSTLNRGEGS


7072


2 


921 


ARGTLRALETAKKVGKVGANGQKAAGPSADSVTENKIGSPPKTP
VSNVAATSAGPSNVGTELNSVPQKSSPFLTRVPAYPPHSENIQY
FQDPRTQIPFEVPQYPQTGYYPPPPTVPAGVAPCVPRFVRSNNV
PESSLPPASMPYADHYSTFSPRDRMNSSPYQPPPPQPYGPVPPV
PSGMYAPVYDSRRIWRPPMYQRDDIIRSNSLPPMDVMHSSVYQT
SLRERYNSLDGYYSVACQPPSEPRTTVPLPREPCGHLKTSCEEQ
IRRKPDQWAQYHTQKAPLVSSTLPVATQSPTPPSTLNRGEGS


7073 


50 


504 


LAHGSFGVSDPPAPAAAPAHTIiTSFSGSLSPQFRKPLGRAPAMP 
LVRYRKWI LGYRCVGKTSLAHQFVEGEFSEGYD PTVENTYSKI 
VTLGKDEFHLHLVDTAGQDEYSILPYSFI IGVHGYVLVYSVTSL 
HSFQVI ESIiYQKLHEGHGK 


7074 


263 


1003 


VCPVLCSTRQEPGHSSLVTYFGKPTRRKEFLLGHCIAAGKMWIS 
\mLETNYAELVU)VGRVTLGENSRKKMKDCKLRKKQNERVSRAM 
CALLNSGGG VI KAE IENED YS YTKDG IGUDLENS FSNILLF VPE 
YLDFMQNGNYFLIFVKSWSI.NTSGLRITTLSSNLYKRDITSAKV 
MNATAAIjE FLKDMKKTRGRIi YLRPEIjLAKRPRVD I QEENNM KAL 
AGVFFDRTELORKE KLTFTESTHVEI 


7075 


598 


1005 


NYINFFFRKEYPPHVQKVEINPVRLSRLQGVBRIMKKTEESESQ 
VEPEI KRKVQQKRHCS TYQ PTP PLSPAS K KCIiTHLEDLQRNCRQ 

AITLNESTGPLLRTSIHQNSGGQKSQNTGLTTKKFYGNNVEKVP 
IDII 


7076


279 


1049 


LQSESSNAAEGNEQRHEDEQRSKRGGWSKGRKRKKPLRDSNAPK " 

SPLTGYVRFMNBRREQLRAKRPEVPFPEITRMLGNEWSKLPPEE 

KQRYLDEADRDKERYMKELEQYQKTEAYKVFSRKTQDRQKGKSH 

RQDAARQATHDHEKETEVKERSVFDIPIFTEEFLKHSKAREAEL 

RQLRKSNMEFEERNAALQKHVESMRTAVEKLEVDV1QERSRNTV 

LQQHLETLRQVLTSSFASMPLPBXGETPTVDTIDSYM 


7077 


3 


1119 


SSMGSNSEINGLALRKTDKYGFLGGSQYSGSLKSSIPVDVARQR 
ELKWLDMFSNWDKWLSRRFQKVKIiRCRKGIPSSLRAKAWQYLSN 
SKELLEQNPRKFEELERAPGDPKWLDVIEKDLHRQFPFHEMFAA 
RGGHGGQDLYRILKAYTIYRPDEGYCQAQAPVAAVLLMHMPAEQ 
Ar w-AjV«jjll.Ujv,i Jbrl*x xbAbJuEAIQLiDGEIFFALIiRRASPLAHR 
HLRRQRIDPVLYMTEWFMCIFARTLPWASVLRVWDMFFCEGVKI 
I FRVALVLLRHTLGS VEKLRSCQGM YETMEQLRNLPQQCMQEDF 
LVHEVTNLP VTEALI ERENAAQLKKWRETRGELQYRPSRRLHGS 
RAIHEERRRQQPPLGPSSS 


7078 


483 


767


FQGQRMAGEQKPSSNLLEQFitLLAKGTSGSALfALlSQVLEAPG 
VYVFGELLE LANVQE LAEGANAA YLQLLNLFAYGTYPDYI ANKE 
SLPELY 


7079 


2 


376 


SWEFKRPKEPSGSDGESDGPIDVGQEGQLSQMARPLSTPSSSQ 
MQARKKRRGIIEKRRRDRINSSLSEI^RLVPTAFEKQGSSKLEK 








AEVLQMTVDHLKMLHATGGTGTHALLFQAS FIQQI F 


7080 


200 


595 


VQLPLEAPCLSIiLSCRDHSGGNRDLSRRHRDCRVYGSPQDGIPY" 
LTHPLCHQDWSVGRLQIRAIATPGHTQGHLVYLLDGEPYXGPS 
CLFSGDLLFLSGCGEFPRKRBELGEEGBTEVRAATVPWRALKP ' 


7081 


213 


506 


A VTEEEM I LNS LS LC YHNKL I LAPM VR VGTLPMRLLALDYGADI 
VYC E EL I DL KMIQCKRVVNEVLSTVDF VAPDDR WFRTCEREQN 
RWFQMGTS 


7082 


3 


1137 


APSRNTMLMAWCRGPVLLCLRQGW3Tr)SFliHG3LGOEPFEGAft3Xi 
CCRSSPRDLRDGEREHEAAQRKAPGAESCPSLPLSISDIGTGCL 
SSLENLRLPTLREESSPRELEDSSGDQGRCGPTHQGSEDPSMLS 
OAQSATEVEERHVSPSCSTSRERPFQAGELILAETGEGETKFKK 
LFRLNNFGLLNSNWGAVPFGKI VGKFPGQ I LRSS FGKQYMLRRP 
ALEDYWLMKRGTAITFPKDINMILSMKDINPGDTVLEAGSGSG 
GMSLFLSKAVGSQGRVISFEVRKDHHDLAKKNYKHWRDSWKLSH 
VEE W PDNVD F I HKD IS GATED I KS LTFDAVALDMLN PH VTL PVF 
YPHLKHGGVCPVYWN I TQVI ELLD 


7083 


115 


541 


RSNAVQLTRMEYAMKSLSLLYPKSLSRHVSVRTSWTQQLLSEP 
SPKAPRARPCRVSTADRSVRKGIMAYSLEDLLLKVRDTLMLADK 
PFFLVLEEDGTTVETEEYFQALAGDTVFMVLQKGQKWQPPSEQG 
TRHPLSL3HK 


7084 


3 


522 


NS VS VSS QS RFLAS VPGTGVQRS AAADMAASTAAG KQR I PKVAK 
VKNKAPAEVQ ITAEQLLREAKERELELLPPPPQQKITDE EELND 
YKLRKRKTFEDNI RKNRTVISNW I KYAQWEES LKEIQRARS I YE 
RALDVDYRNITLWLKYAEMEMKNRQVNHARNIWDRAITTL 


7085


243 


1499 


RQLARLRRRG WRS PFGGAPMAH ITI NQ YLQQVYEAIDSRDGAS C 
AELVSFKH PHVANPRLQMASPEEKCQQVLE PP YDEMFAAHLRCT 
YAVGNHDF I EAYKCQTVI VQS FLRAFQAHKEENVJALPVMYAVAL 
DLRVFANNADQQtjVKKG KS KVGDML E KAAE LLMS CFRV CASDTR 
AGIEDSKKWGMLFLVNQLFKIYFKINKLHLCKPLIRAIDSSNLK 
DD YS TAQRVTYKYYVGRKAMFDSDFKQAEE YLS FAFEHCHRSSQ 
KNKRMILIYLLPVKMLLGHMPTVELLKKYHLMQFAEVTRAVSEG 
NLLLLHEALAKHEAFFIRCGIFLILEKLKIITYRNLFKKVYLLL 
KTHQ LS LDAFL VALKFMQVEDVD IDE VQCI LANLI YMGHVKG Y I 
SHQHQKLWSKQNPFPPLSTGC 


7086 


256 


525 


ILAARMGKQNSKLRPEVMQDLLESTDFTEHEIQEWYKGFLRDCP " 

SGHLSMEEFKKIYGNFFPYGDASKFAEHVFRTFDANGDGTIDFR 

EF 


7087 


166 


723 


LSGS S AGKVAAP CVPPSNHELVP I TTENAPKNWDKGEGASRGG 
NTR KS LEDNGSTRVTPS VQ PHLQ P I RNMS VS RTMEDSCELDLVY 
VTERI I AVS FPS TANEEN FRSNLRE VAQMLKS KHGGNYLL FNLS 
ERRPDITKLHAKVLEFCWPDLHTPALEKICSICKAMDTWLNAHP 
HRCRVLHNKG 


7088 


104 


759 


GTSAAS PS S LLEMAGE I TETO>ELYSS Y VGLVYMFNLI VGTGALT 
MPKAFATAGWLVSLVLLVFLGFMS FMTTTFVI EAMAAANAQLHW 
KRMENLKEEEDDDSSTASDSDVLIRDNYERAEKRPILSVQRRGS 
PNPFEITDRVEMGQMASMFFNKVGVNLPYFCI IVYLYGDLAI YA 
AAVPFS LMQVTCSATGNDS CGVEADTKYNDTDRCWGPLRRVD 


7089 


33 


1775 


SVCWEDRYLKARMEESPLSRAPSRGGVNFLNVARTYIPNTKVEC
HYTLPPGTMPSASDWIGIFKVEAACVRDYHTFVWSSVPESTTDG
SPIHTSVQFQASYLPKPGAQLYQFRYVNRQGQVCGQSPPFQFRE
PRPMDELVTLEEADGGSDILLWPKATVLQNQLDESQQERNDLM
QLKLQLEGQVTELRSRVQELERALATARQEHTELMEQYKGISRS
HGEITEERDILSRQQGDHVARILELEDDIQTISEKVLTKEVELD
RLRDTVKALTREQEKLLGQLKEVQADKEQSEAELQVAQQENHHL
NLDLKEAKSWQEEQSAQAQRLKDKVAQMKDTLGQAQQRVAELEP








LKEQLRGAQELAASSQQKATLLGEELASAAAARDRTIAELHRSR
LEVAEVNGKLAELGLHLKEEKCQWSKERAGLLQSVEAEKDKILK
LSAEILRLEKAVQEERTQNQVFKTELAREKDSSLVQLSESKREG
TELRSALRVLQKEKEQLQEEKQELLEYMRKLEARLEKVADEKWN
EDATTEDEEAAVGLSCPAALTDSEDESPEDMRLHPMAFVSVETQ
ASLLLGLE 


7090 


33 


1775 


SVCWEDRYLKARMEESPLSRAPSRGGVNFLNVARTYIPNTKVEC 
HYTLPPGTMPSASDWIGIFKVEAACVRDYHTFVWSSVPESTTDG 
SPIHTSVQFQASYLPKPGAQLYQFRYVNRQGQVCGQSPPFQFRE

QLKLQLEGQVTELRSRVQELERALATARQEHTELMEQYKGISRS 
HGEITEERDILSRQQGDHVARILELEDDIQTISEKVLTKEVELD
RLRDTVKALTREQEKLLGQLKEVQADKEQSEAELQVAQQENHHL 
NLDLKEAKSWQEEQSAQAQRLKDKVAQMKDTLGQAQQRVAELEP 
LKEQLRGAQELAASSQQKATLLGEELASAAAARDRTIAELHRSR 
LEVAEVNGKLAELGLHLKEEKCQWSKERAGLLQSVEAEKDKILK

TELRSALRVLQKEKEQLQEEKQELLEYMRKLEARLEKVADEKWN
EDATTEDEEAAVGLSCPAALTDSEDESPEDMRLHPMAFVSVETQ
ASLLLGLE 


7091 


186 


1076


EGMLTREHRCGRSEEQELEPWPSPKKARSGRWLRNGFKRKMEEP 
EE PADSGOSLVP VY I YS P E YV ^MPD <5 T A K T S> K R a cmuhc t . T B*a V 

ALHKQMRIVKPKVASMEE^TFHTDAYLQHLQKVSQEGDDDHPD 
SI E YGLG YDCPATEG I FD YAAAI GG AT I TAAQCLI DGMCKVA IN 
WSGGWHHAKKDEASGFCYLNDAVLGILRLRRKFERILYVDLDLH 
HGDGVEDAFS FTS KVMTVS LHKFSPGFFPGTGDVSDVGLGKGRY 
YS VNVP IQDG IQDEKYYQ I CER YEP PAPNPGL 


7092 


522 


809 


KQGINEDQEESQKPRLGEGCEPISKRQMKKLIKQKQWBEQRELR 
KQKRKEKRKRKKLERQCQMEPNSDGHDRKRVRRDWHSTLRLII 
DCSFDXLM 


7093 


454 


655 


NFGVSGVELAQQAS MVRMS FVIAACQLVLGLLMTSLTESS IQNS 
ECPQLCVCE IRPWFTPQST YREA 


7094 


2 


508


FVRSMHWGVGFASSRPCWDLSWNQS I SFFGWWAGSEEPFS fyg 
DI IAFPLQDYGGIMAGIX5SDPWWKKTLYLTGGALLAAAAYLLHE 
LLVIRKQQE IDSKDAI ILHQFARFNNGVPSLS PFCLKMETYLRM 
ADLP YQNYFGGKLSAQGKMPWI EYNHEKVSGTEF 1 1 


7095 


1 


411 


IASSLPKMASLLQSDRVLYLVQGEKKVRAPLSQLYFCRYCSELR 
S LECVS HE VDSH YC PS CLENMPS AEAKLKKNRCANCFDC PGCMH 
TLSTRATS I STQLPDDPAKTTMKKAY YLACG FCRWTSRDVGMAD 
KSVGE 


7096


224 


2067 


ETRS LAVQEKPS QAGRRRS S RIS FAGALFLTRFLLQELLLNN FC 
SAMSPAPDAAPAPASISLFDLSADAPVFQGLSLVSHAPGEALAR 
APRTSCSGSGERESPERKLLQGPMDISEKLFCSTCDQTFQNHQE 
QREH YKIiDWHRFNLKORLKDKPLLSALDFEIcn*? QTYnnT.QQ T Cnc 

EDSDSASEEDLQTLDRERATFEKLSRPPGFYPHRVLFQNAQGQF 
LYAYRCVLGPHQDPPEEAELLLQNLQSKGPRDCWLMAAAGHFA 
GAI FQGRE WTHKTFHR YTVRAKRGTAQGLRDARGGPSHSAG AN 
LRR YNEATLYKDVRDLLAGPS WAKALE EAGT I LLRAPRSGRS L P 
FGGKGAPLQRGDPRLWDI PLATRRPTFQELQRVLHKLTTLHVYE 
EDPREAVRLHSPQTHWKTVREERKKPTEEEIRKICRDEKEALGQ 
NEESPKQGSGSEGEDGFQVELELVELTVGTLDLCESEVLPKRRR 
RKRNKKEKSRDQEAGAHRTLLQQTQEEEPSTQSSQAVAAPLGPL 
LDBAKAPGQPELWNALLAACRAGDVGVLKLQLAPS PADPRVLSL 
LSAPLGSGGFTLLHAAAAAGRGSWRLLLEAGADPTVQCQDH 


7097 


256 


1228 


IRTKSAATWEAWPQCX3REGSRIITEPCEANAGSRQELQTERISS 
FLAAQGDQAFHSGLETNNSNSELPLRVGLKVAQGSPLMGGQVSA [ 








SNS FSRLHCRNANEDWMSALCPRLWDVPLHHLS IPGSHDTMTYC 
LNKKSPISHEESRLIiQLLNKALPCITRPVVLKWSVTQALDVTEQ 
LDAGVRYLDLRIAHMLEGSEKm,HFVHMVYTTALVEDTLTElSE 
WLERHPREWILACRNFEGLSEDLHEYLVACIKNIFGDMLCPRG 
EVPTLRQLW SRGQQ VI VS YEDE S S LRRHHBL Wp G VP YWWGNR VK 
TEALIRYLETMKSCGR 


7098 


82 


956 


SSFLKRCRKVLGCWGIPSEQSLFSTLBEPRDKEIDNYCVMRLQT 
EARSGFWAPNRFPVNICRMTAVDGDRGGSSRETCRCHFHPSLEA 
LVLLLQDWQPGGVGICTSFLGISWALLDYHRALRTCIjPSKPLLG 
LGSSVIYFLWNLLLLWPRVLAVALFSALFPSYVALHFLGLWLVL 
LLWVWI^TDFMPDPSSEWLYRVTVATILYFSWFNVAEGRTRGR 
AI IHFAFLLSDS ILLVATWVTHSS WLPSGI PLQLWLPVGCGCFF 
LGLALRLVYYHWLHPS CCWKPDPDQVD 


7099 


992 


210 


LFRLAPGFLRSIiARQGYHQIWAFPFLPSGATATWPAASRSRSLA'" 
ARSLPRSPARPGPNDALIiGEHDFRGQGVRAQRFRFSEBPGPGAD 
GAVLEVHVPQIGAGVSLPGILAAKCGAEVILSDSSELPHCLEVC 
R0S CQMNNLPHLQ WGLT WGH I S WDLLALPPQJD I ILASD VFFEP 
ED FED I LATIYFLMHKNPKVQLWSTYQVRSADMSLEALLYKWDM 
KCVHIPLESFDADKEDIAESTLPGRHTVEMLVISFAKDSL 


7100 


205 


671 


ANGG FWEAAPG3 EVSLPLWVPTASHSKTTALGIGSAPPPHLS VL 
FIiFS FP PQLGDPLEAFP VFKKYDRNGIiNVS I ECKRV5GLE PATV 
DWAFDLTKTNMQTM YEQS EWGWKDREKREEMTDDRAW YL I AWEN 
SSVPVAFSHFRFDVERGDEVLYW 


7101 


2 


503


WRGGPRRAKRLAGGAVGWVLLVRGVHSVRAGGGRPPRAADMKKD
VRILLVGEPRVGKTSLIMSLVSEEFPEEVPPRAEEITIPADVTP
ERVPTHIVDYSEAEQSDEQLHQEISQANVICIVYAVNNKHSIDK
VTSRWIPLINERTDKDSRLPLILGGNKSDLVEYSR


7102 


2 


503 


WRGGPRRAKRLAGGAVGWVLLVRGVHSVRAGGGRPPRAADMKKD
VRILLVGEPRVGKTSLIMSLVSEEFPEEVPPRAEEITIPADVTP
ERVPTHIVDYSEAEQSDEQLHQEISQANVICIVYAVNNKHSIDK
VTSRWIPLINERTDKDSRLPLILGGNKSDLVEYSR


7103 


119 


438 


gsqssvavnirsgtdeesmdlmngqassvniaatasekssss'e"s"~ 
lsdkgs elkks fdawfdvlkvtpeeyagqitlmdvp vfkaiqp 
delsscgwnkkekyssap 


7104 


1670 


795 


RLWEHRSVSAGASGWGLSSPGC^IOIPSLPEEERVDILII^AGV'' 
MRCPHWTTEDGFEMQFGVNHIjGEAWAGAAPWVQAI lprrppkvl 
G F * V * V KS DLF 1 1 LNPGHFLLTNLIiLDKUKAS AP SR 1 INLS S LA 
HVAGH IDFDDLNWQTRKYNTKAAYCQS \KLAI VLFTKELSRRLQ 
GSGVTVNALHPGVARTELGRH'TGIHGSTFLQHHN\WAtILLAAWS 
KSPRSWPAPAQHNTIiAVAEELA\VISGKYFDGLKQKAPAPEAED 
EEVARRLWABSARLVGLEAPSVREQPLPR 


7105 


765 


143 


gqmcrrpspkstsclsmtcdlp/rglqdpqclalfrvavdkhqa" 
llkaamsgqgvdrhlfalyivsrflhlqspfltqvhseqwqlst 
sqipvqqmmlfdvhnypdyvssgggfgpaddhgygvsyifmgdg 
mi tfh I s s kks stktds hrlgqhi edalld vas l fqagqhfkrr 


7106 
7107 


14 
1145 


1064 
591 


glqaghphprsasripeadth\ysklqrafdsivnkdhkrmfgt 
yfrvgffgskfgdldeqefvykepaitklpeishrleafygqcf 
gaefvevikdstpvdktkldpnkayiqitfvepyfdeyemkdrv 
tyfeknfnlrrfmyttpftlegrprgelheqyrrntvlttmhaf 
pyiktrisviqkeefvltpievaiedmkkktlqlavainqeppd 

AKMLQMVLQGSVGATVNQGPLEVAQVFLAEIPADPKIjYRHHNKL 

rlcfkefimrcgeaveknkrlitadqreyqqblkknynklkbnl 

RPmierkipelykpifrvesqkrdsfhrssfrkcbtqlsqgs 

*i*wlqtgkkk 


7108 


1 


942 


VKVALLLTNLEQPRTESBWENSFTI/KMFLFQFVNLNSSTFYIAP 
FLGRFTGHPGAYLRLlNRWRIiEECHPSGCLIDLCMQMGIlMVLK 
QTWNNFMELGYPLIQNWWTRRKVRQEHGPERKI SFPQWEKD YNL 
QPMNAYGLFDEYLEMILQFGFTTIFVAAPPLAPLLALLNNIIEI 
RLDAYKFVTQWRRPLASRAKDIGIWYGILEG1GILSVITNAFVI 
AI TSDFI PRLVYA YKYGP CAGQGEAGQKCMVG Y VNASLS VFR I S 
DFENRSEPBSDGSEFSGTPLKYCRYRDYRDPPHSLVPYGYTLQF 
WHVLAW 


7109 


964 


102 


WDQRKRNSLVPGPAHGPAQEEPWEKKESLGAAQEALSIQLQPKE 
TQPFPKSEQVYLHFLSWTEDGPEPKDKGSLPQPPITEVESQVF 
SEKLATDTSTFEATSEGTLELQQRNPKAERLRWSPAQEESFRQM 
WIHKEIPTGKKDHECSECGKTFIYNSHLWHQRVHSGBKPYKC 
SDCGKTFKQSSNLGQHQRIHTGEKPFECNECGKAFRWGAHLVQH 
QRIHSGEKPYECNECGKAFSQSSYLSQHRRIHSGEKPFICKECG 
KAYGWCSBLIRHRRVHARKEPSH 


7110 


96 


697 


RLDN FSGFLVEVTKEERH I VKPLYDR YRLVKQMLTRASITP VLG " 
SPSTKRRGQMLQPI I EGETAHFFEE I KEEEEDGVNLSSELGDML 
KTAVQVQSSLKNS ESDVEENQEKLALDLRLSSS RAASMPELLEQ 
LWKARAEKKKLRKTLREFEEAFYQQNGRNAQKEDRVPVLEEYRE 
YKKIKAKLRIiLEVIiISKQDSSKSI 


7111


2 


414 


GSGLYRGPTPGGQCIWKPNSMPPDHERNFGFTQFAIjELNELTAE 
IiKRSLPSXDTRLRPDQRYLEEGNlQAAEAQKRRIEQljQRDRRKV 
MEENNIVHQARFFRRQTDSSGKEWWVTNNTYWRLRAEPGYGNMD 
GAVLW 


7112 


103 


495 


PRCFPVADRGRtlGGLPDWTIMEGKTLNLTCTVFGNPDPEVIW~ 
FKNDQD IQLSEHFS VKVEQAKYVSMT I KGVTS EDSGKYS INI KN 
KYGGEKIDVTVSVYKHGEKI PDMAPPQQAKPKLI PASASAAGQ 


7113 


1 


824 


KCLRQAWHEAP SS LAFTR WCSREE RAEGGGNLHRS I TRD P KP PG 
LRPSQRPMDDKKKKRS PKPCLAQPAQAPGTLRRVPVPTSHSGS L 
ALGLPHLPS PKQRAKFKRVGKE KGRPVLAGGGSGSAGTPLQHS F 
LTEVTDVYEMEGGLLNLLNDFHSGRLQAFGKECSFEQLEHVREM 
QEKLARIiHFSLDVCGEEEDDEEEEDGVTEGLpEEOKKTWADRNL 
DQLLSNLGSCLGALVPGGMRGGEGTYSQSHSWAIiGEKVGVHGSK 
SSGPLNLPRR 


7114 


3 


1492 


VWEVDEQIDHYKESQDKFLWQAAFIGKETIiKDESGQECKICRKI 
IYLNTDFVSVKQRIiPKYYSWERCSKHHLNFLGQNRSYVRKKDDG 
CKAYWKVCLHYNIjHKAQPAERFFDPNQRGKALHQKQALRKSQRS 
QTGEKLYKCTECGKVFIQKANLVVHQRTHTGEKPYECCECAKAF 
SQKS TL I AHQRTHTGE KP YE CS EOGKT FIQKSTL I KHQRTHTG E 
KPFVCDKCPKAFKSSYHLIRHEKTHIRQAFYKGIKCTTSSLIYQ 
RIHTSEKPQCSEHGKASDEKPSPTKHWRTHTKENIYECSKCGKS 
FRGKSHLS VHQR I HTGEKP Y ECS ICGKTFSG KSHLS VHHRTHTG 
E KP YECRRCGKA FG EKSTL I VHQRMHTGEKP YKCNE CX3KAFS E K 
S PLI KHQRIHTGERP YECTDCKKAFSRKS TLIKHQR I HTGEKP Y 
KCSECGKAFSVKSTLIVHHRTHTGEKPYECRDCGKAFSGKSTLI 
KHQRSHTGDKNL 


7115


1 


947 


NAAHG YNVIGLW CM Y 1 1 P PQD WLDRGDE S A P I RTPAM I G CS FWD 
REYFGDIGLLDPGMEVYGGENVKLGMRVWQCGGSMEVLPCSRVA 
HIERTRKPYNNDIDYYAKRNALRAAEVWMDDFKSHVYMAWNIPM 
SNPGVDFGDVSERLALRQRLKCRSFKWYLENVYPEMRVYNNTLT 
YGEVRNSKASAYCLDQGAEDGDRAILYPCHGMSSQLVRYSADGL 
LQLGPLGSTAFLPDSKCLVDDGTGRMPTLKKCEDVARPTQRLWD 
FTQSG P I VSRATGR CLE VEMS KDANFGLRLWQRCS GQKWM IRN 
WIKHARH 


7116 


866


95 


RVRMRRNAEVIEEKLSMKSWAKFRPGEPWKGYPNIDPETDPYVT 
PGSVINNLS INTVRE VDHLRDRMSGSSSSLNTTLPSTSAWSS IR 
ASNYNVPLSSTAQSTSARNSDSKLTWSPGSVTNTSLAHELWKVP
LP P KN I TAP S RP PPGLTGQKP PLSTWDNS PLR I GGGWGNSDAR Y 
TPGSSWGBSSSGRITNWLVLKNLTPQIDGSTIiRTLCMQHGPLIT 
FHLNLPHGNALVRYSSKEEWKAQKSLHISDLFLLTL 


7117 


695 


1261 


LLISTPGGCHPPPSSIEFTYTGAWGKALPAPHMPCAPGALPQGA 
FVSQAARAIPLLQPSQAAQAEGLSQPARACGALCSLPWPLRNWG 
SPILRLPGGLRTPTNDRKTRTRSAMACWARAQWDTLGPLKLSHR 
GKVCLRHPRPTGVRGGPGAAGRQGGMGTRRRGTFTSGARDPGGL 
RVKHRCQPTGHLP 


7118 


49 


1863 


PHCEPNPGAGAMVLLHVLFEHAVGYALLALKEVEEISLLQPQVE
ESVLNLGKFHSIVRLVAFCPFASSQVALENANAVSEGVVHEDLR
LLLETHLPSKKKKVLLGVGDPKIGAAIQEELGYNCQTGGVIAEI
LRGVRLHFHNLVKGLTDLSACKAQLGLGHSYSRAKVKFNVNRVD
NMIIQSISLLDQLDKDINTFSMRVREWYGYHFPELVKIINDNAT
YCRLAQFIGNRRELNEDKLEKLEELTMDGAKAKAILDASRSSMG
MDISAIDLINIESFSSRVVSLSEYRQSLHTYLRSKMSQVAPSLS
ALIGEAVGARLIAHAGSLTNLAKYPASTVQILGAEKALFRALKT
RGNTPKYGLIFHSTFIGRAAAKNKGRISRYLANKCSIASRIDCF
SEVPTSVFGEKLREQVEERLSFYETGEIPRKNLDVMKEAMVQAE
EAAAEITRKLEKQEKKRLKKEKKRLAALALASSENSSSTPEECE
EMSEKPKKKKKQKPQEVPQENGMEDPSISFSKPKKKKSFSKEEL
MSSDLEETAGSTSIPKRKKSTPKEETVNDPEEAGHRSGSKKKRK
FSKEEPVSSGPEEAAGKSSSKKKKKFHKASQED


7119 


49 


1863 


PHCEPNPGAGAMVLLHVLFEHAVGYALLALKEVEEISLLQPQVE
ESVLNLGKFHSIVRLVAFCPFASSQVALENANAVSEGVVHEDLR
LLLETHLPSKKKKVLLGVGDPKIGAAIQEELGYNCQTGGVIAEI
LRGVRLHFHNLVKGLTDLSACKAQLGLGHSYSRAKVKFNVNRVD
NMIIQSISLLDQLDKDINTFSMRVREWYGYHFPELVKIINDNAT
YCRLAQFIGNRRELNEDKLEKLEELTMDGAKAKAILDASRSSMG
MDISAIDLINIESFSSRVVSLSEYRQSLHTYLRSKMSQVAPSLS
ALIGEAVGARLIAHAGSLTNLAKYPASTVQILGAEKALFRALKT
RGNTPKYGLIFHSTFIGRAAAKNKGRISRYLANKCSIASRIDCF
SEVPTSVFGEKLREQVEERLSFYETGEIPRKNLDVMKEAMVQAE
EAAAEITRKLEKQEKKRLKKEKKRLAALALASSENSSSTPEECE
EMSEKPKKKKKQKPQEVPQENGMEDPSISFSKPKKKKSFSKEEL
MSSDLEETAGSTSIPKRKKSTPKEETVNDPEEAGHRSGSKKKRK
FSKEEPVSSGPEEAAGKSSSKKKKKFHKASQED


7120 


1991 


64 


QLGTRRCLRGDKVTNAMQDFLVTNLEPRFIEPQTANLSVVFKDS " 
NSTTPL IFVLS PGTDPAADLYKFAEEMKFSKKLSAISLGQGQGP 
RAEAMMRSS I E RGKW VFFQNCH LAPS WMPALERLI EHI NPDKVH 
RDFRLWLTSLPSNKFPVS I LQNGSKMTI EP PRGVRANLLKS YS5 
LGEDFLNSCHKVMEFKSLLLSLCLFHGNALERRKFGPLGFNIPY 
EFTDGDLRICISQLKMFLDE YDDI P Y KVLKYTAGE INYGGRVTD 
DWDRRC IMNI LEDFYNPDVLSPEHS YSASGI YHQI PPTYDLHGY 
LSYIKSLPLNDMPEI PGLHDNANITFAQNETFALLGTI IQLQPK 
SSSAGS QGREE I VED VTQNI LLKVPE P INLQWVMAK YP VL YEES 
MNTVLVQEVIRYNRLLQVITQTLQDLLKALKGLVVMSSQLELMA 
AS LYNNTVPELWS AKAYPSLKPLS S WVMDLLQRLD FLQAW IQDG 
I PAVFW I SGFFF PQAFLTGTLQNFARKFVIS IDTI S FDFKVMFE 
APSELTQRPQVGCYIHGLFLEGARWDPEAFQLAESQPKELYTEM 
AVIWLLPTPNRKAQDQDFYLCPIYKTLTRAGTLSTTGHSTNYVI 
AVE I PTHQPQRHW I KRGVAL I GALDY 


7121 


2 


546 


RPLRPWVLSLGSMVGLMTYGRRQFQSLDTTMRRLIPPFREASAK
LTTLVDADAEAFTAYLEAMRLPKNTPEEKDRRTAALQEGLRRAV
SVPLTLAETVASLWPALQELARCGNLACRSDLQVAAKALEMGVF
GAYFNVLINLRDITDEAFKDQIHHRVSSLLQEAKTQAALVLDCL
ETRQE


7122 


2 


546 


RPLRPWVLSLGSMVGLMTYGRRQFQSLDTTMRRLIPPFREASAK
LTTLVDADAEAFTAYLEAMRLPKNTPEEKDRRTAALQEGLRRAV
SVPLTLAETVASLWPALQELARCGNLACRSDLQVAAKALEMGVF
GAYFNVLINLRDITDEAFKDQIHHRVSSLLQEAKTQAALVLDCL
ETRQE


7123 


1 


1092 


KPAVPEARS AGTS EAGRS GAE EVS CGS VSG DG AAMRLTP RALCS 
AAQAAWRENFPLCGRDVARWPPGHMAKGLKKMQSSLKLVDCI IE 
VHDARIPLSGRNPLFQETLGLKPHLLVLNKMDLADLTBQQKIMQ 
HLEGEGLKNVI FTNCVKDENVKQ I I PMVTELIGRSHRYHRKENL 
EYCIMVIGVPNVGKSSLINSLRRQHIjRKGKATRVGGEPGITRAV 
MSKI QVSERPLMFLLDTPGVLAPR I ES VBTGLKLALCGTVLDHL 
VGBETMADYLLYTLNKHQRFGYVQHYGLGSACDNVERVLKSVAV 
KLGKTQ KVKVLTGTGNVNVI QPNYPAAARDFLQTFRRGLLGS VM 
LDLDVLRGHPRV 


7124 


2 


^82 


lpltlllaapfahlllppghdqspcwhpSpalspgtlgplswam 

ANSGLQLLGY FLALGGWVG 1 1 ASTAL PQW KQS S YAGDAS IQLRS 
KVFVLESEWGGDSLGLPRDCGWSCLLHSAVRSEKGFWS 


7125


166 


1127 


NCISEKRNYSFSMQKGKGRTSRIRRRKLCGSSESRGVNESHKSE" 

FIELRKWLKARKFQDSNLAPACFPGTGRGLMSQTSLQEGQMIIS 

LPESCLLT\RDTVIRSYLGAYITKWKPPPSPLLALCTFLVSEKH 

AGHRSLLEA\YLEILPKAYTCPVCLEPEWNLLPKSLKAKAEEQ 

RAHVQEFFASSRDFFSSLQPLFAEAVDSIFSYSALLWAWCTVNT 

RAVYL\SPGSGNAFLQSRTPVQLAPYI,DLLNHSPHVQVKAAFNE 

ETHSYEIRTTSRWRKHEBVFICYGPHDNQRLFLEYGFVSVHNPH 

ACVYVSRGWNQLCS 


7126 


1 


733 


CRDMAAFIVPSPARRCSQKGSLGHLPTQPWLWAAMSPRGQBRGT 
SHSQAREPQRPGRWLI/SSIiQSSPGTLGQAGTASRRRGCMVQRWV 
Q VATGRRAVQV P KGALGLALGETS PGASRGMS GGAGGCWALG WA 
PSPVLPSWLLEGPPPWLSIISDSGTQRPSPRRCPARPSPWGPQC 
WRGGRIASAEASST*TPGSGSRARSGRRSPGSRRRSASAPSPTP 
PTDACA+ S CVARPAGSRSSRPAAA 


7127 


1311 


277 


GLPAMCST*KAGYYEETEGDCIPKDR* IEKRPFKEI *RRIPRIF 
AKQKQ I * S * NSQK I GASE I DRGRKEADCS DAPAAAR I GAVS VFR 
RSTQEARVSPRSNAKSAKLRAVRAD^WEHFVLLFHTPEQFLAEC 
ICRST* *K*WHQLC*PLSSL*TGI.KRKLLL*VLFRI *WLKDCDV 
* FCQKI FATNFCNWQNLIQ* EE * KPVE YSVEN*HIMNLLLPM * L 
QQSS LRDQT I VTWRM* RN YSMFRINM I S S L * DGS I H I PLKLHFY 
PALI FTLTVPINSCCQRPLPLFAHQS I KTLASSGS PMLACLRFL 
LVKKRAFIHTPRS PGCSV* CKHVLVKDNKNNCVGSEV 


7128 


2 


5228 


GRVDLWTILLGRSALRELSQIEAEIiNKHWRRLLEGLSYYKPPSjp 
S S AEKV KANKDVAS PLKELGLRI S KFLGLDE EQS VQLLQC YLQE 
DYRGTRDSVKTVLQDERQSQALILKIADYYYEERTCHiRCVLHL 
LTYFQDERHPYRVE YADCVDKLEKELVS XYRQQFEELYKTEAPT 
WETHGNLMTERQVSRWFVQCLREQSMLLEIIFLYYAYFEMAPSD 
LLVLTKM FKEQGFGS RQTNRHLVDETMD P FVDR I G Y FS AL I LVE 
GMDIESLHKCALDDRRELHQFAQDGLICQDMDCIiMIjTFGDIPHH 
APVLLAWALLRHTLNPEETSSWRKIGGTAIQLNVFQYLTRLLQ 

slasggndcttstacmcvygllsfvltsleuhtlgnqqdiidta 

CEVLADPSLPELFWGTEPTSGLGIILDSVCGMFPHLLSPLIiQLL 
RAL VSGKSTAKKVYS FLDKMS FYNEL YJKHKPHDVISHEDGTLWR 
RQTPKLLYPLGGQTNLRIPQGTVGQVWLDDRAYLVRWEYSYSSW 
TLFTC B 1 EMLLHWS TADVI QHCQRVK P 1 1 DLVHKVI STDLS I A 
DCLLP ITS R I YMLLQRLTT VI S P P VD VI ASC VNCLTVLAARNPA 
KVWTDIiRHTGFLPFVAHPVSSLSQMISAEGMNAGGYGNLLMNSE 
QPO^EYGVTIAFLRLITTLVKGQLGSTQSQGLVPCVMFVLKEMI, 
PSYHKWRYNSHGVREQIGCLILELIHAILNLCHETDLHSSHTPS
I/JFLCICSLAYTEAGQTVINIMGIGVOTIDTTVMAAQPRSIXjAEG 
QGQGQLLIKTVKLAFSVTNNVIRLKPPSNVVSPLEQALSQHGAH 
GNNLIAVLAKYIYHKHDPALPRLA2QLLKRLATVAPMSVYACLG 
NDAAAI RDAFLTRLQS K\ I E \ DMR I K\ VM 1 L \ EFLT VA \VETQ P 
GLIELFLNLEVKDG\SDGSKEFSLGMW\SCLHAV/VWELIDSQQ 
QDRYWCPPLLHRAAIAFLHALWQDRRDSAMLVLRTKPKFWENLT 
SPLFGTLSPPSETSEPS ILETCALIMKI ICLEI YYWKGSLDQP 
L KDTLKKFS I E KRFA YWSG Y VKS LAVH VAETEG SS CTSLLE YQM 
LVSAWRMLLIIATTHADIMHLTDSVVRRQLFLDVLDGTKALLLV 
PASVNCLRLGSMKCTLLLILLRQWKRELGSVDEILGPLTEILEG 
VLQADQQLME KTKAKVFSAF I TVLQMKE M KVS D I PQ YS QL VLNV 
CETLQEEVIALFDQTRHSLALGSATEDKDSMETDDCSRSRHRDQ 
RIX5VCVI/3LHLAKELCEVDEDGDS WLQVTRRLP I LPTLLTTLE V 
SLRMKQNLHFTEATLHLLLTLARTQQGATAVAGAGITQS 1 CLPL 
LSVYQLSTNGTAQTPSASRKSLDAPSWPGVYRLSMSLMEQLLKT 
LR YN FLPEAUDFVG VHQERTLQCLNAVRTVQS LACLE EADHT VG 
FILQLSNFMKEraFHLPQLMRDIQVNLGYLCQACTSFLHSRKML 
QHYLQNKNGDGLPSAV\AQRV\QRPPSAASAAPSSSKQPAADTE 
ASECMArJTTVQYGLLKILSKTLAALRHFTPDVCQrLLDQSLDIiA 
EYNFLFALSFTTPTFDSEVAPSFGTLLATVNVAliNMLGELDKKK 
E PLTQAVGLS TQAEGTRTLKS LLMFTMENCF YLLI SQAMR YLRD 
PAVHPRDKQRMKQELSSELSTLLSSLSRYFRRGAPSSPATGVLP 
S PQGKS TS LS KAS PE SQE PL I Q LVQAFVRHMQR 


7129 


1 


1054 


FRRFRWRRRLH*AGPASSAGGSPGEASGTMSGkliPPNINIKEPR 
WDQSTFIGRANHFFTVTDPRNILLTNEQIiESARKIVHDYRQGIV 
PPGLTENELWRAKYIYDSAFHPDTGEKMILIGRMSAQVPMNMTI 
TGCMMTFYRTTPAVLFWQWINQSFNAVVNYTNRSGDAPLTVNEL 
GTAYVS ATTG AVATALG LNALTKHVS PLIGRFVPFAAVAAANCI 
N I PLMRQRELKVG I PVTDENGNRLGESANAAKQAITQVWS R I L 
MAAPGMAI P P FI MNTLEKKAFLKRFPWMSAP IQVGLVGFCLVFA 
TPLCCALFPQKSSMSVTSLEAELQAKIQESHPELRRVYFNKGL 


7130 


2 


780 


HEVPSLQTSDPLPGSVQRCSVWSQPNKENWCQDHLYNSIiGRKG 
I S AKSQ P YHRS Q SSS S VL INKSMDS I N YPSD VGKQQLLS LHRS S 
RCESHQDLLPDlADSHQQGTEKLSDLTLQDSQKWWNRNLPIiN 
AQIATQNYFSNFKETDGDEDDYVEIKSEEDESELBLSHNRRRKS 
DS KFVDADFSDNVCSGNTLHS LNS P RT P KKP VNS KLGL S P YLTP 
YNDS DKLND Y LWRG PS PNQ QN I VQS LRE KFQ CLSS S S FA 


7131 


805 


573 


AAAEGHI EWKFLI EACKVNP FAKDR WGNIPLDDAVQFNHLE W 
KLLQDYQDS YTLSETQAEAAAEALSKENLESMV 


7132 


1420 


1087 


I DMLLLSGALVSGP YTL ITTAVSADLGTHKSLKGNAHALSTVTA " 
1 1 DGTGS VGAALG PLLAGLLS PS GWSNVFYMLM FADACALLFL I 
RL I HKELS CPGSATGDQVPFKEQ 


7133


2 


3648 


QQIPGIiLPAHGESGDALRKPRLQKPITGHLDDLFFTLYPSLEKF 
E E ELLELHVQDHFQEGCG PLDGGALE I LERRLR VGVHNGLGFVQ 
RPQWVLVPEMDVALTRSASFSRKWSSSKTSSGSQALVLRSRL 
RL PEMVGH PAFAVI FQLE YVFS S PAGVDGNAAS VTS LSNLACMH 
MVRWAVWNPLLEADSGRVTIiPLQGGIQPNPSHCLVYKVPSASMS 
S E EVKQ VESGTLR FQFSLGS EEHLD APTE PVSG P KVERRPSRKP 
PTSPSSPPAPVPRVLAAPQNSPVGPGLSISQLAASPRSPTQHCL 
ARPTSQIiPHGSQASPAQAQEFPLEAGISHLEADLSQTSLVLETS 
IAEQLQELPFTPLHAPIWGTQTRSSAGQPSRASMVLLQSSGFP 
EILDANKQPAEAVSATEPVTFNPQKEESDCLQSNEMVLQFIiAFS 
RVAQDCRGTSWPKTVYFTFQFYRFPPATTPRLQLVQLDEAGQPS 
SGALTH I LVP VS RDGTFDAGS PG FQ LR YM VGPG FLKPGERRCFA 
RYLAVQTLQI DVWDGDSLLLIGSAAVQMKHLLRQGRPAVQASHE 








LEWATEYEQDNMWSGDMWtXiKVKPIGVHSVVKGRUiLTLAN ' 

VGHPCEQKVRGCSTLPPSRSRVISNDGASRFSGGSLLTTGSSRR 

KHWQAQKLADVDSELAAMUjTHARQGKGPQDVSRESDATRRRK 

LBRMRSVRLQEAGGDLGRRGTSVLAQQSVRTQHLRDLQVIAAYR 

ERTKAESIASLLSLAITTEHTLHATLGVAEFFEFVLKNPHNTQH 

TVTVEIDNPELSVIVDSQEWRDFKGAAGLHTPVEEDMFHLRGSL 

APQLYLRPHETAHVPFKFQSFSAGQLAMVOASPGIjSNEKGMnAV 

SPWKSSAVPTKHAKVLFRASGGKPIAVLCLTVELQPHWDQVFR 

FYHPELSFLKKAIRLPPWHTFPGAPVGMLGEDPPVHVRCSDPNV 

ICETQNVGPGEPRDIFLKVASGPSPEIKDFFVIXYSDRWLATPT 

QTWQVYLHSLQRVDVSCVAGQLTRLSLVLRGTQTVRKVRAFTSH 

PQEL KTDPKGVF VL PP RGVQDLHVGVRPLRAG S RFVHLNL VDVD 

CHQL VASWLVCLCCRQPL ISKAFE IMLAAGEGKGVNKRI TYTNP 

YPSRRTFHLHSDHPELLRFREDSFQVGGGETYTIGLQFAPSQRV 

GEEEILIYINDHEDKNEEAFCVKVIYQ 


7134 
7135 


2115 


1111 


GGEGFSYPPHVGLSLGTPLDPHYVLLEVHYDNPTYEEGLlDNSG 
LRLFYTMDIRKYDAGVIEAGLWVSLFHTIPPGMPEFQSEGHCTL 
ECLEEALEAEKPSGIHVFAVLLHAHLAGRGIRLRHFRKGKEMKL 
LAYDDDFDFNFQEFQYLKEEQT1LPGDNLITECRYNTKDRAEMT 
WGGLSTRSEMCLSYLLYYPRINLTRCASIPDIMEQLQFIGVKEI 
YRPVTTWPFIIKSPKQYKNLSFMDAMNKFKWTKKEGLSFNKLVL 
3l.PVNVRCSKTDNAEW3IC2GMTALPPDIERPYKAEPl,VCGrSSS 
SSLHRDFS1NLLVCLLLLSCTLSTKSL 




2 


2072 


FVPRVTPRSLSLQGPKGBSVGSITQPLP^S^LIKRAASESDGRC 
WIiDALELALRCSSIjLRLGTCKPGRDGEPGTSPDASPSSLCGLPA 
SATVHPDQDLFPLNGSSLENDAFSDKSERENPEESDTETQDHSR 
JCTESGSDQSETPGAPVRRGTTYVEQVQEELGELGEASQVETVSE 
ENKSLMWTLLKQLR PGMDLSRWL P TF VLEPRS FLNKLSD YYYH 
ADLLSRAAVEEDAYSRMKLVLRWYLSGFYKKPKGIKKPYNPILG 
ETFRCCW FHPQTD SRT F Yl AE QVSHHP P VS AFH VS NRKDG FC IS 
G S ITAKS R FYGNS LS ALLDG KATLTFLNRAEDYTLTMP YAHCKG 
I L YGTMTIiELGGKVTI ECAKNN FQ AQLE FKLKP FFGGSTS I NQ I 
SGKITSGEEVLASLSGHWDRDVFIKEEGSGSSALFWTPSGEVRR 
QRLRQHTVPLEEQTELESERLWQHVTRAI S KGDQHRATQ EKFAL 
EEAQRQRARERQESLMPWKPQLFHLDPITQBWHYRYEDHSPWDP 
LKDIAQFEQDGILRTLQQEAVARQTTFLGSPGPRHERSGPDQRL 
RKASDQPSGHSQATESSGSTPESCPELSDEEQDGDFVPGGESPC 
PRCRKEARRLQALHEAILSIREAQQBLHRHLSAMLSSTARAAQA 
PTPGLLQSPRSWFLliCVFLACQLFINHILK 


7136 
7137 


2 


418 


DF VP S FRRP SGNT& QTVWLLRAkTLE KE VAGLRE KI HHLDDMLK 
SQQRKVRQMIEQLONSKAVIQSKDATIQELKEKXAYLEAEJNLEM 
HDRMEHLIEKQISHGNFSTQARAKTENPGSIRISKPPSPKPMPV
IRWET 




2 


466 


WASGMSTVPGGSRHSLGIQVRGGWGVTGGEEESLTVPVADTWQA
GSFKVATQERNPQRAQMRLRRQKKGWPFLGDFLTELQRLDSAI
PDDLDGNTNKRSKEVRVLQEMQLLQVAAMNYRLRPLEKFVTYFT
RMEQLSDKESYKLSCQLEPENP 


7138 
7139 


2 


466 


WASGMSTVPGGSRHSLGIQVRGGWGVTGGEEESLTVPVADTWQA 
GSFKVATQERNPQRAQMRLRRQKKGWPFLGDFLTELQRLDSAI 
PDDLDGNTNKRSKEVRVLQEMQLLQVAAMNYRLRPLEKFVTYFT
RMEQLSDKESYKLSCQLEPENP 


7140 


1 


357 


SLRNSARGLKWAASAARGAAALRRSINQPVAFVRRIPWTAASSQ 
LKEHFAQFGHVRRCILPFDKBTGFHRGLGWVQFSSEEGLRNALQ 
QENTHIIDGVKVQVHTRRPKLPQTSDDEKKDF




1401 


1957 


RASSLQVLKAWGGLIPSSFQQQHTGQYALEELFDLKVYDCFCSF 
NMNVSLEKQLRPSQPWPRGKCRKTPGWEEARPKAQDLRGDLGKT 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








QAGPAEAHTRGPPRLPAATGCPPHLPGLLSGISVDIDPTGLQSQ 

WTPKGQDPPLMFSEDYQKSLLEQYHLGIiDQKLRKYWGEJjIWNF 
ADFMTNQCG 


7141 
7142 


124 


t mi 


LUSRSCWLDMEDLEEDVRFIVDETLDFGGLSPSDSREEEDlTVL 

VTPEKPLRRGLSHRSDPNAVAPAPQGVRLSLGPLSPEKLEEILD 

EANRLAAQLEQCAIiQDRESAGEGLGPRRVKPSPRRETFVLKDSP 

VRDLLPTVNSLTRSTPS/LKQPDASTPE***EGVSQGSPGYIWK

EALQHEEGVTHLQSVPCIQKPSIFSS\SRSTPPVRGRAGPSGRA 

AASEETRAAKLRGAAAKSSCQLPIPSAIPRPASRMPLTSRSVPP

GRGALPPDSLSTRKGLPRPSTAGHRVRESGHKVPVSQRLNLPVM 
GATRSNLQPP 


7143 


658 


839 


LIFLMLHMELKMLSSVTLHIRAFLYWICLKPTSCLIFQlvfVLNLL 
KK*SRAVGWWMCRT/YSSDLQVGVIKPWLLLGSQDAAHDLDT 
LKKNKVTHILNVAYGVENAFLSDFTYKSISILDLPETNILSYFP 
ECFEFIEEAKRKDGWIiVHCNA 


7144 


3 


773 


SI.EWSSDGEPbSRMDSEDSISSTIMDVDSTISSGRSTPAM^GQ 
GSTTSSSKNIAYNCCWDQCQACFNSSPDLADHIRSIHVDGQRGG 
VFVCLWKGCKVYNTPSTSQSWLQRHMLTHSGDKPFKCWGGCNA 
SFASQGGIARHVPTHFSQQNSSKVSSQPKAKEESPSKAGMNKRR 
KLKNKRRRSIiARPHDFFDAQTLDAIRHRAICFNLSAHlESLGKG 
HSWFHSTVSILLFFQIKYKTLQKNISTIISK5LKI 




1 


988 


FR VNMQDGGPS PAEHS KAEES AGMEAR FT^r.PiwarccpDTDnn — 
RCPAPRPAGVSYVIRDEVEKYNRNGVNALQLDPALNRLFTAGRD
SIIRIWSVNQHKQDPYIASMEHHTDWVNDIVLCCNGKTLISASS
DTTVKVWNAHKGFCMSTLRTHKDYVKALAYAKDKELVASAGLDR
QIFLWDVNTLTALTASNNTVTTSSLSGNKDSIYSLAMNQLGTII
VSGSTEKVLRVWDPRTCAKLMKLKGHTDNVKALLLNRDGTQCLS 
GSSDGTIRLWSLGQQRCIATYRVHDEGVWALQVNDAFTHVYSGG 
RDRKIYCTDLRNPDIRVLICB 
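
The one-letter legend in the table header can be applied mechanically to any amino acid segment in the table. The following is a minimal sketch, not part of the original disclosure: the function name `summarize_segment` and the idea of flagging characters that fall outside the legend are illustrative assumptions. It counts each legend symbol in a segment, including the `*`, `/` and `\` annotations, and reports any character the legend does not define.

```python
# One-letter codes exactly as listed in the table legend.
LEGEND = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown", "*": "Stop Codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def summarize_segment(segment):
    """Count legend symbols in a segment and list characters outside the legend.

    Hypothetical helper for illustration only. Whitespace is dropped first,
    since spaces and line breaks carry no meaning in the segment.
    """
    cleaned = "".join(segment.split())
    counts = {}
    unrecognized = []
    for ch in cleaned:
        if ch in LEGEND:
            counts[ch] = counts.get(ch, 0) + 1
        else:
            unrecognized.append(ch)
    return counts, unrecognized


if __name__ == "__main__":
    # A segment line taken from the table above (SEQ ID NO: 7143 row).
    counts, unrecognized = summarize_segment(
        "KK*SRAVGWWMCRT/YSSDLQVGVIKPWLLLGSQDAAHDLDT"
    )
    print(counts["*"], counts["/"], unrecognized)
```

A character reported as unrecognized is not defined by the legend; in a scanned copy of the table it may reflect a transcription error rather than a residue, which is an assumption and not something the table itself states.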



WHAT IS CLAIMED IS:

1. An isolated polynucleotide, comprising a nucleotide sequence selected from the
group consisting of SEQ ID NO: 1-1786 and 3573-5358, a mature protein coding portion
of SEQ ID NO: 1-1786 and 3573-5358, an active domain of SEQ ID NO: 1-1786 and
3573-5358, and complementary sequences thereof.

2. An isolated polynucleotide encoding a polypeptide with biological activity, 
wherein said polynucleotide hybridizes to the polynucleotide of claim 1 under stringent 
hybridization conditions. 

3. An isolated polynucleotide encoding a polypeptide with biological activity, 
wherein said polynucleotide has greater than about 90% sequence identity with the 
polynucleotide of claim 1. 

4. The polynucleotide of claim 1 wherein said polynucleotide is DNA. 

5. An isolated polynucleotide of claim 1 wherein said polynucleotide comprises the 
complementary sequences. 

6. A vector comprising the polynucleotide of claim 1.

7. An expression vector comprising the polynucleotide of claim 1.

8. A host cell genetically engineered to comprise the polynucleotide of claim 1.

9. A host cell genetically engineered to comprise the polynucleotide of claim 1 
operatively associated with a regulatory sequence that modulates expression of the 
polynucleotide in the host cell. 

10. An isolated polypeptide, wherein the polypeptide is selected from the group
consisting of:

(a) a polypeptide encoded by any one of the polynucleotides of claim 1; and

(b) a polypeptide encoded by a polynucleotide hybridizing under stringent 
conditions with any one of SEQ ID NO: 1-1786 and 3573-5358. 

11. A composition comprising the polypeptide of claim 10 and a carrier.

12. An antibody directed against the polypeptide of claim 10.

13. A method for detecting the polynucleotide of claim 1 in a sample, comprising:

a) contacting the sample with a compound that binds to and forms a 
complex with the polynucleotide of claim 1 for a period sufficient to form the complex; 
and 

b) detecting the complex, so that if a complex is detected, the
polynucleotide of claim 1 is detected.

14. A method for detecting the polynucleotide of claim 1 in a sample, comprising:

a) contacting the sample under stringent hybridization conditions 
with nucleic acid primers that anneal to the polynucleotide of claim 1 under such 
conditions; 

b) amplifying a product comprising at least a portion of the 
polynucleotide of claim 1; and 

c) detecting said product and thereby the polynucleotide of claim 1 in
the sample.

15. The method of claim 14, wherein the polynucleotide is an RNA molecule and the 
method further comprises reverse transcribing an annealed RNA molecule into a cDNA 
polynucleotide. 

16. A method for detecting the polypeptide of claim 10 in a sample, comprising:

a) contacting the sample with a compound that binds to and forms a
complex with the polypeptide under conditions and for a period sufficient to form the 
complex; and 

b) detecting formation of the complex, so that if a complex formation 
is detected, the polypeptide of claim 10 is detected. 

17. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10 under 
conditions sufficient to form a polypeptide/compound complex; and 

b) detecting the complex, so that if the polypeptide/compound 
complex is detected, a compound that binds to the polypeptide of claim 10 is identified. 

18. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10, in a 
cell, under conditions sufficient to form a polypeptide/compound complex, wherein the 
complex drives expression of a reporter gene sequence in the cell; and 

b) detecting the complex by detecting reporter gene sequence 
expression, so that if the polypeptide/compound complex is detected, a compound that 
binds to the polypeptide of claim 10 is identified. 

19. A method of producing the polypeptide of claim 10, comprising:

a) culturing a host cell comprising a polynucleotide sequence selected 
from the group consisting of a polynucleotide sequence of SEQ ID NO: 1-1786 and
3573-5358, a mature protein coding portion of SEQ ID NO: 1-1786 and 3573-5358, an active
domain of SEQ ID NO: 1-1786 and 3573-5358, complementary sequences thereof and a
polynucleotide sequence hybridizing under stringent conditions to SEQ ID NO: 1-1786 
and 3573-5358, under conditions sufficient to express the polypeptide in said cell; and 

b) isolating the polypeptide from the cell culture or cells of step (a). 

20. An isolated polypeptide comprising an amino acid sequence selected from the
group consisting of any one of the polypeptides SEQ ID NO: 1787-3572 and 5359-7144,
the mature protein portion thereof, or the active domain thereof.

21. The polypeptide of claim 20 wherein the polypeptide is provided on a polypeptide
array. 

22. A collection of polynucleotides, wherein the collection comprises the sequence
information of at least one of SEQ ID NO: 1-1786 and 3573-5358.

23. The collection of claim 22, wherein the collection is provided on a nucleic acid 
array. 

24. The collection of claim 23, wherein the array detects full-matches to any one of 
the polynucleotides in the collection.

25. The collection of claim 23, wherein the array detects mismatches to any one of 
the polynucleotides in the collection. 

26. The collection of claim 22, wherein the collection is provided in a computer- 
readable format. 

27. A method of treatment comprising administering to a mammalian subject in need 
thereof a therapeutic amount of a composition comprising a polypeptide of claim 10 or 20 
and a pharmaceutically acceptable carrier. 

28. A method of treatment comprising administering to a mammalian subject in need 
thereof a therapeutic amount of a composition comprising an antibody that specifically 
binds to a polypeptide of claim 10 or 20 and a pharmaceutically acceptable carrier. 
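
Two notions recited in the claims above, "complementary sequences" (claims 1 and 5) and "greater than about 90% sequence identity" (claim 3), can be illustrated with a short sketch. This is not the method of the specification: it assumes plain DNA strings over A, C, G, T and N, and it computes identity by a simple position-by-position comparison of two already aligned sequences of equal length, whereas identity is commonly computed with a gapped alignment program. The function names and the toy sequences are hypothetical.

```python
# Watson-Crick base pairing for DNA; N (any base) pairs with N.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "N": "N"}


def reverse_complement(dna):
    """Return the complementary strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


def percent_identity(seq_a, seq_b):
    """Percent of matching positions between two pre-aligned sequences.

    Illustrative assumption: the inputs are already aligned and have the
    same length, so no gaps are introduced or scored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for x, y in zip(seq_a.upper(), seq_b.upper()) if x == y)
    return 100.0 * matches / len(seq_a)


if __name__ == "__main__":
    # Toy sequences, not taken from the sequence listing.
    reference = "ATGCATGCATGCATGCATGC"
    candidate = "ATGCATGCATGCATGCATGA"  # differs at the last position only
    print(reverse_complement("ATGC"))
    print(percent_identity(reference, candidate))
```

Under these assumptions, a 20-base candidate with a single mismatch against the reference scores above the "about 90%" threshold of claim 3, while a candidate with three or more mismatches would not.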